Re: posting binary file and metadata in two separate documents

2009-07-17 Thread rossputin

Hi.

Thanks for your reply. Shame nobody has implemented the multiple
'ContentStreams' idea yet :-)
With regard to posting in a form, I had considered that, but unfortunately
there can be an arbitrary number of 'ext.literals', so it would be difficult
to build a form which would handle all cases.
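
One way around a fixed form is to generate the parameters in code. The sketch below is only an illustration in Python: the helper name `literal_params` and the `related` field are hypothetical, and whether the handler accepts a repeated ext.literal parameter for a multi-valued field is an assumption to check against your Solr version and schema.

```python
from urllib.parse import urlencode


def literal_params(metadata, prefix="ext.literal."):
    """Expand a metadata mapping into (name, value) request parameters.

    A list or tuple value yields one parameter per element, so the number
    of literals is driven by the data rather than by a fixed form.
    """
    pairs = []
    for field, value in metadata.items():
        values = value if isinstance(value, (list, tuple)) else [value]
        pairs.extend((prefix + field, str(v)) for v in values)
    return pairs


# Hypothetical metadata; "related" stands in for document relationships.
pairs = literal_params({"id": 2, "some_code1": "code1",
                        "related": ["doc7", "doc9"]})
query = urlencode(pairs)  # usable as a query string or as a form body
```

The resulting pairs can be sent in the query string or, to keep the URL short, in the request body.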

Regards,

 -- Ross


hossman wrote:
 
 
 : Subject: posting binary file and metadata in two separate documents
 
 there was some discussion a while back about the fact that you can push
 multiple ContentStreams to Solr in a single request, and while the
 existing handlers all just iterate over and process them separately, it
 would be *possible* for a variant of ExtractingRequestHandler to use the
 first stream to get document metadata, and have that metadata reference the
 other streams in some way (for large chunks of text).
 
 But no one has attempted to implement that as far as i know.
 
 :
 http://localhost:8983/solr/update/extract?ext.literal.id=2\&ext.literal.some_code1=code1\&ext.literal.some_code2=code2\&ext.idx.attr=true\&ext.def.fl=text
 : -F myfi...@myfile.pdf
 : 
 : Where I have large numbers of ext.literal params this becomes a bit of a
 : chore.. and it would be the same case in an html form with many params...
 : can I pass both files to '/update/extract' as documents, (files) linked
 : together?  Or are there any other options like this?  Perhaps something I
 : can do with Solrj.
 
 there's no reason those params have to be in the URL.  you can do a
 multipart POST with application/x-www-form-urlencoded in one part and your
 pdf file in another part (just like doing a POST from a massive HTML form
 with an 'input type=file' option)
 
 
 -Hoss
 
 
 

-- 
View this message in context: 
http://www.nabble.com/posting-binary-file-and-metadata-in-two-separate-documents-tp24375649p24530051.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: posting binary file and metadata in two separate documents

2009-07-16 Thread Chris Hostetter

: Subject: posting binary file and metadata in two separate documents

there was some discussion a while back about the fact that you can push
multiple ContentStreams to Solr in a single request, and while the
existing handlers all just iterate over and process them separately, it
would be *possible* for a variant of ExtractingRequestHandler to use the
first stream to get document metadata, and have that metadata reference the
other streams in some way (for large chunks of text).

But no one has attempted to implement that as far as i know.

: 
http://localhost:8983/solr/update/extract?ext.literal.id=2\&ext.literal.some_code1=code1\&ext.literal.some_code2=code2\&ext.idx.attr=true\&ext.def.fl=text
: -F myfi...@myfile.pdf
: 
: Where I have large numbers of ext.literal params this becomes a bit of a
: chore.. and it would be the same case in an html form with many params... 
: can I pass both files to '/update/extract' as documents, (files) linked
: together?  Or are there any other options like this?  Perhaps something I
: can do with Solrj.

there's no reason those params have to be in the URL.  you can do a
multipart POST with application/x-www-form-urlencoded in one part and your
pdf file in another part (just like doing a POST from a massive HTML form
with an 'input type=file' option)
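
A minimal sketch of such a multipart POST, in Python with only the standard library. It assumes Solr reads ordinary form fields as request parameters and treats the file part as the content stream. The helper names `build_multipart` and `post_to_solr` and the field name `myfile` are illustrative and do not come from this thread.

```python
import urllib.request
import uuid


def build_multipart(params, file_field, filename, file_bytes,
                    file_type="application/pdf"):
    """Return (body, content_type) for a multipart/form-data request."""
    boundary = uuid.uuid4().hex
    chunks = []
    # One part per request parameter, e.g. ext.literal.id=2.
    for name, value in params.items():
        part = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n'
            "\r\n"
            f"{value}\r\n"
        )
        chunks.append(part.encode("utf-8"))
    # One part carrying the binary file itself.
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'
        f"Content-Type: {file_type}\r\n"
        "\r\n"
    )
    chunks.append(file_header.encode("utf-8"))
    chunks.append(file_bytes)
    chunks.append(f"\r\n--{boundary}--\r\n".encode("utf-8"))
    return b"".join(chunks), f"multipart/form-data; boundary={boundary}"


def post_to_solr(url, body, content_type):
    """POST the prepared body; the URL stays free of ext.* parameters."""
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST")
    with urllib.request.urlopen(request) as response:
        return response.read()
```

With the parameters from the original curl command collected in a dict, the call would look like `post_to_solr("http://localhost:8983/solr/update/extract", *build_multipart(params, "myfile", "myfile.pdf", pdf_bytes))`, where `params` and `pdf_bytes` are supplied by the caller. Because the parameters travel in the body, their number is no longer tied to the length of the URL.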


-Hoss



Re: posting binary file and metadata in two separate documents

2009-07-10 Thread rossputin

Hi.

Apologies for bumping this one, but another question occurred to me: is
there a limit to the number of ext.literal components I can put in my curl
command? If so, I will definitely need to find another way to get this
data in, as I am building up relationships between documents, and there will
be many of them.

Thanks in advance for your help,

regards,

Ross



rossputin wrote:
 
 Hi.
 
 I am currently using Solr Cell to extract content from binary files, and I
 am passing along some additional metadata with ext.literal params. Sample
 below:
 
 curl
 http://localhost:8983/solr/update/extract?ext.literal.id=2\&ext.literal.some_code1=code1\&ext.literal.some_code2=code2\&ext.idx.attr=true\&ext.def.fl=text
 -F myfi...@myfile.pdf
 
 Where I have large numbers of ext.literal params this becomes a bit of a
 chore.. and it would be the same case in an html form with many params... 
 can I pass both files to '/update/extract' as documents, (files) linked
 together?  Or are there any other options like this?  Perhaps something I
 can do with Solrj.
 
 Thanks in advance for your help,
 
 regards,
 
 Ross.
 
 
 




posting binary file and metadata in two separate documents

2009-07-07 Thread rossputin

Hi.

I am currently using Solr Cell to extract content from binary files, and I
am passing along some additional metadata with ext.literal params. Sample
below:

curl
http://localhost:8983/solr/update/extract?ext.literal.id=2\&ext.literal.some_code1=code1\&ext.literal.some_code2=code2\&ext.idx.attr=true\&ext.def.fl=text
-F myfi...@myfile.pdf

Where I have large numbers of ext.literal params this becomes a bit of a
chore.. and it would be the same case in an html form with many params... 
can I pass both files to '/update/extract' as documents, (files) linked
together?  Or are there any other options like this?  Perhaps something I
can do with Solrj.

Thanks in advance for your help,

regards,

Ross.

