Problem with pdf, upgrading Cell

2010-04-23 Thread Marc Ghorayeb
, but it doesn't extract anything; only the literal values that I pass on are indexed. Any help would be greatly appreciated!! :) Thank you. Marc Ghorayeb
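For readers unfamiliar with the "literal values" mentioned above: Solr Cell accepts literal.<fieldname> request parameters alongside the uploaded file, and those values are indexed as-is even when extraction returns nothing. The sketch below only builds such a request URL. The /update/extract path, the base URL, and the field names are assumptions based on Solr's example configuration, not details taken from this thread.

```python
from urllib.parse import urlencode


def build_extract_url(base_url, doc_id, extra_literals=None, commit=True):
    """Build a Solr Cell request URL (hypothetical helper, not part of Solr).

    Assumes the extracting handler is mapped to /update/extract, as in
    Solr's example solrconfig.xml; adjust if your setup differs.
    """
    # literal.<field> parameters are stored verbatim, independent of Tika.
    params = {"literal.id": doc_id}
    for name, value in (extra_literals or {}).items():
        params["literal." + name] = value
    if commit:
        params["commit"] = "true"
    return base_url.rstrip("/") + "/update/extract?" + urlencode(params)


# Example with assumed values; the PDF itself would be sent as the POST body.
url = build_extract_url("http://localhost:8983/solr", "doc1", {"source": "manual"})
print(url)
```

If a document indexed this way shows only the literal fields and no extracted body text, that matches the symptom described in this thread.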

RE: Problem with pdf, upgrading Cell

2010-04-23 Thread Marc Ghorayeb
From: Marc Ghorayeb dekay...@hotmail.com To: solr-user@lucene.apache.org Sent: Fri, April 23, 2010 8:42:53 AM Subject: Problem with pdf, upgrading Cell Hello, I configured a Solr server to be able to extract data from various documents, including pdfs. Unfortunately, the data

RE: Problem with pdf, upgrading Cell

2010-04-23 Thread Marc Ghorayeb
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message From: Marc Ghorayeb dekay...@hotmail.com To: solr-user@lucene.apache.org Sent: Fri, April 23, 2010 8:42:53 AM Subject: Problem with pdf

RE: Problem with pdf, upgrading Cell

2010-04-23 Thread Marc Ghorayeb
Seems like I'm not the only one with this no-extraction problem: http://www.mail-archive.com/solr-user@lucene.apache.org/msg33609.html Apparently he tried the same thing, building from the trunk and indexing a PDF, and no extraction occurred... Strange. Marc G.

RE: Problem with pdf, upgrading Cell

2010-04-23 Thread Marc Ghorayeb
. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message From: Marc Ghorayeb dekay...@hotmail.com To: solr-user@lucene.apache.org Sent: Fri, April 23, 2010 9:12:39 AM Subject: RE: Problem

RE: Problem with pdf, upgrading Cell

2010-04-26 Thread Marc Ghorayeb
Okay, I've been digging a little through the Java code from SVN, and it seems the load function inside the ExtractingDocumentLoader class does not receive the ContentStream (it is set to null...). Maybe I should send this to the developer mailing list? Marc From: dekay...@hotmail.com

Re: Problem with pdf, upgrading Cell

2010-04-30 Thread Marc Ghorayeb
Hi, nope, I didn't get it to work... Just like you, the command-line version of Tika extracts the content correctly, but once included in Solr, no content is extracted. What I have tried so far: - Updating the Tika libraries inside the Solr 1.4 public version, no luck there. - Downloading the latest SVN

RE: Problem with pdf, upgrading Cell

2010-05-03 Thread Marc Ghorayeb
...@apache.org wrote: Praveen and Marc, Can you share the PDF (feel free to email my private email) that fails in Solr? Thanks, Grant On Apr 30, 2010, at 7:55 AM, Marc Ghorayeb wrote: Hi Nope i didn't get it to work... Just like you, command line version of tika extracts

RE: Problem with pdf, upgrading Cell

2010-05-04 Thread Marc Ghorayeb
which parser to use, despite the fact that it properly identifies the MIME Type. -Grant On May 3, 2010, at 5:36 PM, Grant Ingersoll wrote: I'm investigating. On May 3, 2010, at 5:17 AM, Marc Ghorayeb wrote: Hi, Grant, i confirm what Praveen has

RE: Problem with pdf, upgrading Cell

2010-05-04 Thread Marc Ghorayeb
Hey, I got it to work. I just redid my steps; I had forgotten several libraries that were imported through the XML. PDF extraction seems to work once again, and I have yet to find a PDF that raises an exception! Thanks for the investigation, at least we now have a fix :) Marc

RE: Problem with pdf, upgrading Cell

2010-05-04 Thread Marc Ghorayeb
of content, it shows that line as title too and mine one. 'title' field is defined as multivalued in schema. Any idea what's going on? Or am I missing something? On Tue, May 4, 2010 at 4:13 PM, Marc Ghorayeb dekay...@hotmail.com wrote: Hey, I got it to work. I just redid

RE: Problem with pdf, upgrading Cell

2010-05-05 Thread Marc Ghorayeb
it extracted only one line of content, it shows that line as title too and mine one. 'title' field is defined as multivalued in schema. Any idea what's going on? Or am I missing something? On Tue, May 4, 2010 at 4:13 PM, Marc Ghorayeb dekay...@hotmail.com

RE: Problem with pdf, upgrading Cell

2010-05-05 Thread Marc Ghorayeb
Praveen, I am indeed using a trunk version from last week's SVN, I think. You could always try a version from the Hudson builds. I did not try this procedure with Solr's 1.4 release, though. Marc

RE: Problem with pdf, upgrading Cell

2010-05-11 Thread Marc Ghorayeb
Great news, thanks :) Marc

Copyfield multi valued to single value

2010-06-09 Thread Marc Ghorayeb
Hello, Is there a way to copy a multivalued field to a single value by taking for example the first index of the multivalued field? I am actually trying to sort my index by Title and my index contains Tika extracted titles which come in as multi valued hence why my title field is multi valued.
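One way to approach the question above is to pick the first value on the client side before the document is sent to Solr and store it in a separate single-valued field used only for sorting. This is a minimal sketch under stated assumptions: the helper and the title_sort field name are hypothetical, and the schema would need a matching single-valued field. It is not something the thread confirms as the chosen solution.

```python
def add_sort_title(doc, source_field="title", target_field="title_sort"):
    """Return a copy of doc with the first source value in target_field.

    Hypothetical client-side helper; 'title_sort' is an assumed field name
    that would have to be declared single-valued in the schema.
    """
    values = doc.get(source_field)
    if isinstance(values, (list, tuple)):
        # Multi-valued: keep only the first entry, if there is one.
        first = values[0] if values else None
    else:
        # Already single-valued (or missing).
        first = values
    result = dict(doc)
    if first is not None:
        result[target_field] = first
    return result


# Example document shaped like a Tika-extracted record with two titles.
doc = {"id": "doc1", "title": ["Extracted title", "Manual title"]}
sortable = add_sort_title(doc)
print(sortable["title_sort"])
```

The original multi-valued title field stays untouched, so searching still sees every title while sorting uses the single value.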

RE: Copyfield multi valued to single value

2010-06-15 Thread Marc Ghorayeb
Thanks for the update, I'll have to find another way then :s. Marc Date: Mon, 14 Jun 2010 13:44:30 -0700 From: hossman_luc...@fucit.org To: solr-user@lucene.apache.org Subject: Re: Copyfield multi valued to single value : Is there a way to copy a multivalued field to a single value by

Strange query behavior

2010-06-28 Thread Marc Ghorayeb
Hello, I have a title that says 3DVIA Studio & Virtools Maya and 3dsMax Exporters. The analysis tool for this field gives me these tokens: 3dviadviastudio;virtoolmaya3dsmaxdssystèmmaxexport However, when I search for 3dsmax, I get no results :( Furthermore, if I search for dsmax I get the

Spellcheck help

2010-07-08 Thread Marc Ghorayeb
Hello, I've been trying to get rid of a bug when using the spellcheck, but so far with no success :( When searching for a word that starts with a number, for example 3dsmax, I get the results that I want, BUT the spellcheck says it is not correctly spelled AND the collation gives me 33dsmax.

RE: Spellcheck help

2010-07-27 Thread Marc Ghorayeb
To: solr-user@lucene.apache.org Subject: Re: Spellcheck help Can anybody help me with this? :( -Original Message- From: Marc Ghorayeb Sent: Thursday, July 08, 2010 9:46 AM To: solr-user@lucene.apache.org Subject: Spellcheck help Hello, I've been trying to get rid of a bug when