Re: TikaEntityProcessor on Solr 1.4?

2010-06-08 Thread Sixten Otto
2010/5/22 Noble Paul നോബിള്‍  नोब्ळ् noble.p...@gmail.com:
 Just copying the dih-extras jar file from the nightly build should be fine.

Now that I've finally got a server on which to attempt to set these
things up... this turns out not to be a viable solution. The extras
jar does contain the TikaEntityProcessor class, but NOT the
BinFileDataSource and BinURLDataSource on which it depends. I tried
both replacing the 1.4 DIH jar with the one from the trunk, and adding
those two specific classes to the extras jar, neither of which worked.
(And I apologize, but I didn't copy down the exceptions involved; if I
can find some free time, I might go back and make the attempt again, a
bit more methodically.)
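
For anyone repeating this experiment, here is a minimal sketch of how one
might confirm which classes a given jar actually contains before dropping it
into a 1.4 install. The class name JarClassCheck and the method
missingClasses are made up for illustration, and the package name
org.apache.solr.handler.dataimport is an assumption about where the DIH
classes live (it is not stated anywhere in this thread), so adjust it to
whatever your jar really uses.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;

// Hypothetical helper: reports which of the named classes are absent from a jar.
public class JarClassCheck {

    // Returns every class name in classNames that has no matching .class entry in the jar.
    public static List<String> missingClasses(Path jar, List<String> classNames) throws IOException {
        List<String> missing = new ArrayList<>();
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            for (String className : classNames) {
                // A class a.b.C is stored in the jar as the entry a/b/C.class
                String entryName = className.replace('.', '/') + ".class";
                if (jarFile.getJarEntry(entryName) == null) {
                    missing.add(className);
                }
            }
        }
        return missing;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: java JarClassCheck <path-to-jar>");
            return;
        }
        // Assumed package name; the three class names come from the thread.
        String pkg = "org.apache.solr.handler.dataimport.";
        List<String> wanted = List.of(
                pkg + "TikaEntityProcessor",
                pkg + "BinFileDataSource",
                pkg + "BinURLDataSource");
        List<String> missing = missingClasses(Paths.get(args[0]), wanted);
        if (missing.isEmpty()) {
            System.out.println("All expected classes are present.");
        } else {
            missing.forEach(name -> System.out.println("Missing: " + name));
        }
    }
}
```

Run it once against the extras jar and once against the main DIH jar. This
only tells you whether the class files are present; it says nothing about
whether they will load or run correctly against the 1.4 core.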

Sixten


RE: TikaEntityProcessor on Solr 1.4?

2010-06-08 Thread Tim Gilbert
When I wanted to add some content to the solrj wiki for glassfish, I had a 
problem in that their anti-spam measures broke the ability to create a new 
account.  Someone here (Chris I think) was kind enough to create a ticket in 
the correct place:

https://issues.apache.org/jira/browse/INFRA-2726

You can see it was solved very quickly.  I am not suggesting that your problem 
is the same, only that INFRA may be the right place to open a ticket about 
getting a file from the wiki; perhaps someone there can help you.

Tim

-----Original Message-----
From: Sixten Otto [mailto:six...@sfko.com] 
Sent: Tuesday, June 08, 2010 3:53 PM
To: solr-user@lucene.apache.org
Subject: Re: TikaEntityProcessor on Solr 1.4?

2010/5/22 Noble Paul നോബിള്‍  नोब्ळ् noble.p...@gmail.com:
 Just copying the dih-extras jar file from the nightly build should be fine.

Now that I've finally got a server on which to attempt to set these
things up... this turns out not to be a viable solution. The extras
jar does contain the TikaEntityProcessor class, but NOT the
BinFileDataSource and BinURLDataSource on which it depends. I tried
both replacing the 1.4 DIH jar with the one from the trunk, and adding
those two specific classes to the extras jar, neither of which worked.
(And I apologize, but I didn't copy down the exceptions involved; if I
can find some free time, I might go back and make the attempt again, a
bit more methodically.)

Sixten


Re: TikaEntityProcessor on Solr 1.4?

2010-05-22 Thread Noble Paul നോബിള്‍ नोब्ळ्
Just copying the dih-extras jar file from the nightly build should be fine.

On Sat, May 22, 2010 at 3:12 AM, Sixten Otto six...@sfko.com wrote:
 On Fri, May 21, 2010 at 5:30 PM, Chris Harris rygu...@gmail.com wrote:
 Actually, rather than cherry-pick just the changes from SOLR-1358 and
 SOLR-1583 what I did was to merge in all DataImportHandler-related
 changes from between the 1.4 release up through Solr trunk r890679
 (inclusive). I'm not sure if that's what would work best for you, but
 it's one option.

 I'd rather, of course, not have to build my own. But if I'm going
 to dabble in the source at all, it's just a slippery slope from the
 former to the latter. :-)  (My main hesitation in doing so would be
 that I'm new enough to the code that I have no idea what core changes
 the trunk's DIH might also depend on. And my Java's pretty rusty.)

 How did you arrive at your patch? Just grafting the entire
 trunk/solr/contrib/dataimporthandler onto 1.4's code? Or did you go
 through Jira/SVN looking for applicable changesets?

 I'll be very interested to hear how your testing goes!

 Sixten




-- 
Noble Paul | Systems Architect| AOL | http://aol.com


Re: TikaEntityProcessor on Solr 1.4?

2010-05-21 Thread Sixten Otto
2010/5/19 Noble Paul നോബിള്‍  नोब्ळ् noble.p...@gmail.com:
 I guess it should work because TikaEntityProcessor does not use any
 new 1.4 APIs

 On Wed, May 19, 2010 at 1:17 AM, Sixten Otto six...@sfko.com wrote:
 The TikaEntityProcessor class that enables DataImportHandler to
 process business documents was added after the release of Solr 1.4,
 ... Has anyone tried back-porting those changes to Solr 1.4?

Did you mean new 1.5 APIs (since TEP was added *after* 1.4 was
released)? Even then, that doesn't make a lot of sense to me, as at
least a couple of new things (the binary data sources) *were* added to
support TikaEntityProcessor.

I'm sorry if I'm being dense, but I'm having trouble understanding this answer.

Sixten


Re: TikaEntityProcessor on Solr 1.4?

2010-05-21 Thread Chris Harris
You are right that TikaEntityProcessor has a couple of other prereqs
beyond stock Solr 1.4. I think the main point is that they're
relatively minor. I've merged TikaEntityProcessor and its prereqs
into my Solr 1.4 tree and it compiles fine, though I haven't yet
tested that TikaEntityProcessor actually works in my setup.

Actually, rather than cherry-pick just the changes from SOLR-1358 and
SOLR-1583 what I did was to merge in all DataImportHandler-related
changes from between the 1.4 release up through Solr trunk r890679
(inclusive). I'm not sure if that's what would work best for you, but
it's one option.

On Fri, May 21, 2010 at 1:28 PM, Sixten Otto six...@sfko.com wrote:
 2010/5/19 Noble Paul നോബിള്‍  नोब्ळ् noble.p...@gmail.com:
 I guess it should work because TikaEntityProcessor does not use any
 new 1.4 APIs

 On Wed, May 19, 2010 at 1:17 AM, Sixten Otto six...@sfko.com wrote:
 The TikaEntityProcessor class that enables DataImportHandler to
 process business documents was added after the release of Solr 1.4,
 ... Has anyone tried back-porting those changes to Solr 1.4?

 Did you mean new 1.5 APIs (since TEP was added *after* 1.4 was
 released)? Even then, that doesn't make a lot of sense to me, as at
 least a couple of new things (the binary data sources) *were* added to
 support TikaEntityProcessor.

 I'm sorry if I'm being dense, but I'm having trouble understanding this 
 answer.

 Sixten



Re: TikaEntityProcessor on Solr 1.4?

2010-05-21 Thread Sixten Otto
On Fri, May 21, 2010 at 5:30 PM, Chris Harris rygu...@gmail.com wrote:
 Actually, rather than cherry-pick just the changes from SOLR-1358 and
 SOLR-1583 what I did was to merge in all DataImportHandler-related
 changes from between the 1.4 release up through Solr trunk r890679
 (inclusive). I'm not sure if that's what would work best for you, but
 it's one option.

I'd rather, of course, not have to build my own. But if I'm going
to dabble in the source at all, it's just a slippery slope from the
former to the latter. :-)  (My main hesitation in doing so would be
that I'm new enough to the code that I have no idea what core changes
the trunk's DIH might also depend on. And my Java's pretty rusty.)

How did you arrive at your patch? Just grafting the entire
trunk/solr/contrib/dataimporthandler onto 1.4's code? Or did you go
through Jira/SVN looking for applicable changesets?

I'll be very interested to hear how your testing goes!

Sixten


Re: TikaEntityProcessor on Solr 1.4?

2010-05-19 Thread Noble Paul നോബിള്‍ नोब्ळ्
I guess it should work because TikaEntityProcessor does not use any
new 1.4 APIs

On Wed, May 19, 2010 at 1:17 AM, Sixten Otto six...@sfko.com wrote:
 Sorry to repeat this question, but I realized that it probably
 belonged in its own thread:

 The TikaEntityProcessor class that enables DataImportHandler to
 process business documents was added after the release of Solr 1.4,
 along with some other changes (like the binary DataSources) to support
 it. Obviously, there hasn't been an official release of Solr since
 then. Has anyone tried back-porting those changes to Solr 1.4?

 (I do see that the question was asked last month, without any
 response: http://www.lucidimagination.com/search/document/5d2d25bc57c370e9)

 The patches for these issues don't seem all that complex or pervasive,
 but it's hard for me (as a Solr n00b) to tell whether this is really
 all that's involved:
 https://issues.apache.org/jira/browse/SOLR-1583
 https://issues.apache.org/jira/browse/SOLR-1358

 Sixten




-- 
Noble Paul | Systems Architect| AOL | http://aol.com


TikaEntityProcessor on Solr 1.4?

2010-05-18 Thread Sixten Otto
Sorry to repeat this question, but I realized that it probably
belonged in its own thread:

The TikaEntityProcessor class that enables DataImportHandler to
process business documents was added after the release of Solr 1.4,
along with some other changes (like the binary DataSources) to support
it. Obviously, there hasn't been an official release of Solr since
then. Has anyone tried back-porting those changes to Solr 1.4?

(I do see that the question was asked last month, without any
response: http://www.lucidimagination.com/search/document/5d2d25bc57c370e9)

The patches for these issues don't seem all that complex or pervasive,
but it's hard for me (as a Solr n00b) to tell whether this is really
all that's involved:
https://issues.apache.org/jira/browse/SOLR-1583
https://issues.apache.org/jira/browse/SOLR-1358

Sixten