</entity>
</entity>
</document>
</dataConfig>
So it's something related to BinFileDataSource and TikaEntityProcessor.
Thanks,
Gary.
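For reference, a minimal data-config sketch consistent with the closing tags above, wiring FileListEntityProcessor to TikaEntityProcessor over a BinFileDataSource. The base directory, file pattern, and field names are placeholders, not the poster's actual config:

```xml
<dataConfig>
  <dataSource type="BinFileDataSource" name="bin"/>
  <document>
    <!-- Outer entity lists the files; rootEntity="false" so each
         inner Tika entity produces its own Solr document. -->
    <entity name="files" processor="FileListEntityProcessor"
            baseDir="/path/to/epubs" fileName=".*\.epub"
            recursive="false" rootEntity="false" dataSource="null">
      <!-- Inner entity extracts text from each file via Tika. -->
      <entity name="tika" processor="TikaEntityProcessor"
              url="${files.fileAbsolutePath}" dataSource="bin"
              format="text">
        <field column="text" name="content"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```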
On 26/02/2015 14:24, Gary Taylor wrote:
Alex,
That's great. Thanks for the pointers. I'll try and get more info on
this and file a JIRA issue.
Kind regards,
Alex,
Same results on recursive=true / recursive=false.
I also tried importing plain text files instead of epub (still using
TikaEntityProcessor though) and get exactly the same result - i.e. all
files fetched, but only one document indexed in Solr.
With verbose output, I get a row for each
Alex,
That's great. Thanks for the pointers. I'll try and get more info on
this and file a JIRA issue.
Kind regards,
Gary.
On 26/02/2015 14:16, Alexandre Rafalovitch wrote:
On 26 February 2015 at 08:32, Gary Taylor g...@inovem.com wrote:
Alex,
Same results on recursive=true / recursive
Thanks for any assistance / pointers.
Regards,
Gary
--
Gary Taylor | www.inovem.com | www.kahootz.com
INOVEM Ltd is registered in England and Wales No 4228932
Registered Office 1, Weston Court, Weston, Berkshire. RG20 8JE
kahootz.com is a trading name of INOVEM Ltd.
, URPs and even a newsletter:
http://www.solr-start.com/
On 25 February 2015 at 11:14, Gary Taylor g...@inovem.com wrote:
I can't get the FileListEntityProcessor and TikaEntityProcessor to correctly
add a Solr document for each epub file in my local directory.
I've just downloaded Solr 5.0.0
Naveen,
Not sure our requirement matches yours, but one of the things we index
is a comment item that can have one or more files attached to it. To
index the whole thing as a single Solr document we create a zipfile
containing a file with the comment details in it and any additional
Naveen,
For indexing Zip files with Tika, take a look at the following thread :
http://lucene.472066.n3.nabble.com/Extracting-contents-of-zipped-files-with-Tika-and-Solr-1-4-1-td2327933.html
I got it to work with the 3.1 source and a couple of patches.
Hope this helps.
Regards,
Gary.
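The zip-bundling approach described above (one comment file plus its attachments, packaged as a single archive so Solr indexes them as one document) can be sketched as follows. File names and contents here are hypothetical placeholders:

```python
import zipfile

# Bundle the comment details and each attachment into one zip, so the
# ExtractingRequestHandler receives a single file per Solr document.
with zipfile.ZipFile("comment-123.zip", "w") as zf:
    zf.writestr("comment.txt", "Comment details go here")
    zf.writestr("attachment-1.txt", "First attached file body")

# The resulting zip would then be posted to Solr, e.g. with curl:
#   curl ".../update/extract?literal.docid=123&commit=true" \
#        -F "file=@comment-123.zip"
```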
On
Jayendra,
I cleared out my local repository, and replayed all of my steps from
Friday and it now it works. The only difference (or the only one that's
obvious to me) was that I applied the patch before doing a full
compile/test/dist. But I assumed that given I was seeing my new log
entries
grateful.
Thanks and kind regards,
Gary.
On 11/04/2011 11:12, Gary Taylor wrote:
Jayendra,
Thanks for the info - been keeping an eye on this list in case this
topic cropped up again. It's currently a background task for me, so
I'll try and take a look at the patches and re-test soon.
Joey
message,
some people have been able to get this functionality to work as desired.
--
Gary Taylor
INOVEM
Tel +44 (0)1488 648 480
Fax +44 (0)7092 115 933
gary.tay...@inovem.com
www.inovem.com
INOVEM Ltd is registered in England and Wales No 4228932
Registered Office 1, Weston Court, Weston
As an example, I run this in the same directory as the msword1.doc file:
curl "http://localhost:8983/solr/core0/update/extract?literal.docid=74&literal.type=5" -F "file=@msword1.doc"
The type literal is just part of my schema.
Gary.
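When building the extract URL by hand, the `literal.*` parameters must be separated with `&` and the whole URL quoted in the shell (otherwise the shell eats the `&`). One way to avoid such mistakes is to let a library encode the query string; this sketch just reuses the docid and type values from the curl example above:

```python
from urllib.parse import urlencode

# literal.* parameters become stored field values on the extracted document
params = {"literal.docid": 74, "literal.type": 5}
url = "http://localhost:8983/solr/core0/update/extract?" + urlencode(params)
print(url)
# → http://localhost:8983/solr/core0/update/extract?literal.docid=74&literal.type=5

# The file itself goes as multipart form data, e.g. with curl:
#   curl "<url>" -F "file=@msword1.doc"
```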
On 03/03/2011 11:45, Ken Foskey wrote:
On Thu, 2011-03-03
Can anyone shed any light on this, and whether it could be a config
issue? I'm now using the latest SVN trunk, which includes the Tika 0.8
jars.
When I send a ZIP file (containing two txt files, doc1.txt and doc2.txt)
to the ExtractingRequestHandler, I get the following log entry
(formatted
Hi,
I posted a question in November last year about indexing content from
multiple binary files into a single Solr document and Jayendra responded
with a simple solution to zip them up and send that single file to Solr.
I understand that the Tika 0.4 JARs supplied with Solr 1.4.1 don't
Thanks Erlend.
Not used SVN before, but have managed to download and build latest trunk
code.
Now I'm getting an error when trying to access the admin page (via
Jetty) because I specify HTMLStripStandardTokenizerFactory in my
schema.xml, but this appears to be no longer supplied as part of
the filenames and
contents. Should I be able to index the contents of files stored in a
zip by using extract ?
Thanks and kind regards,
Gary.
On 25/01/2011 15:32, Gary Taylor wrote:
Thanks Erlend.
Not used SVN before, but have managed to download and build latest
trunk code.
Now I'm getting
Hi,
We're trying to use Solr to replace a custom Lucene server. One
requirement we have is to be able to index the content of multiple
binary files into a single Solr document. For example, a uniquely named
object in our app can have multiple attached-files (eg. Word, PDF etc.),
and we
to the ExtractingRequestHandler for
indexing and included as a part of single Solr document.
Regards,
Jayendra
On Wed, Nov 17, 2010 at 6:27 AM, Gary Taylor <g...@inovem.com> wrote:
> Hi,
>
> We're trying to use Solr to replace a custom Lucene server. One
> requirement we have