Hi Emre,

This is great. Thank you for the contribution and, of course, the
tutorial, which I managed to read yesterday.

Have a great weekend

Lewis

On Fri, Jun 8, 2012 at 4:42 PM, Emre Çelikten <[email protected]> wrote:
> Hello,
>
> I have done as you asked. I hope I have done it correctly as this was my
> first patch. Here's the issue:
> https://issues.apache.org/jira/browse/NUTCH-1382
>
> Here's a tutorial for people that might be interested:
> http://cmusphinx.sourceforge.net/2012/06/building-a-java-application-with-apache-nutch-and-solr/
>
> It might have slight changes soon since I will proof-read it this evening.
>
> Hope that helps.
>
> Best,
>
> Emre
>
> On Fri, Jun 8, 2012 at 2:00 PM, Lewis John Mcgibbney <[email protected]> wrote:
>
>> Hi Emre,
>>
>> Even if you were to open a Jira issue for this and submit a patch of
>> your hack, it would be excellent to have the code available to the
>> community.
>>
>> All the best, oh and glad you got your application working.
>> Lewis
>>
>> On Fri, Jun 8, 2012 at 4:22 AM, Emre Çelikten <[email protected]> wrote:
>> > Hello again,
>> >
>> > I managed to do it. Getting the entire thing to work was tricky. I had to
>> > resort to a hack.
>> >
>> > I will post how I managed to do it here soon, for people that might be
>> > interested in the future.
>> >
>> > Thanks again.
>> >
>> > Best,
>> >
>> > Emre
>> >
>> > On Fri, Jun 8, 2012 at 12:33 AM, Emre Çelikten <[email protected]> wrote:
>> >
>> >> Hello Markus,
>> >>
>> >> Thanks very much for your help.
>> >>
>> >> I have looked at the Nutch source. I think I need to make a different
>> >> version of the indexSolr method in SolrIndexer.java, yes? The current
>> >> version is:
>> >>
>> >> public void indexSolr(String solrUrl, Path crawlDb, Path linkDb,
>> >>       List<Path> segments, boolean noCommit, boolean deleteGone,
>> >>       String solrParams)
>> >>
>> >> I will try to change the "String solrUrl" parameter to "SolrServer server"
>> >> in the new method and use my own SolrServer that was created in the
>> >> application. Do you think this is a correct approach?
>> >>
>> >> Best,
>> >>
>> >> Emre
>> >>
>> >>
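For readers following the thread: the change Emre describes above (accepting a ready-made server object instead of a URL string) has roughly the shape below. This is only a minimal sketch of the overloading pattern, not Nutch or SolrJ code. IndexTarget, InMemoryTarget, TargetFactory and the index methods are hypothetical stand-ins invented for illustration, and the real indexSolr also takes crawlDb, linkDb, segments and other arguments that are left out here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class IndexerOverloadSketch {

    /** Hypothetical stand-in for a Solr client; this is NOT SolrJ's SolrServer. */
    interface IndexTarget {
        void add(String document);
        void commit();
    }

    /** In-memory target playing the role of a server embedded in the application. */
    static class InMemoryTarget implements IndexTarget {
        final List<String> pending = new ArrayList<>();
        final List<String> committed = new ArrayList<>();

        @Override
        public void add(String document) {
            pending.add(document);
        }

        @Override
        public void commit() {
            committed.addAll(pending);
            pending.clear();
        }
    }

    /** Hypothetical factory standing in for "connect to the server at this URL". */
    interface TargetFactory {
        IndexTarget connect(String url);
    }

    private final TargetFactory factory;

    IndexerOverloadSketch(TargetFactory factory) {
        this.factory = factory;
    }

    /** URL-based entry point: resolves the URL, then delegates to the overload. */
    void index(String solrUrl, List<String> documents, boolean noCommit) {
        index(factory.connect(solrUrl), documents, noCommit);
    }

    /** Added overload: the caller supplies an already-created target. */
    void index(IndexTarget target, List<String> documents, boolean noCommit) {
        for (String document : documents) {
            target.add(document);
        }
        if (!noCommit) {
            target.commit();
        }
    }

    public static void main(String[] args) {
        InMemoryTarget embedded = new InMemoryTarget();
        // The factory is never consulted on this path; it only serves the URL variant.
        IndexerOverloadSketch indexer = new IndexerOverloadSketch(url -> embedded);
        indexer.index(embedded, Arrays.asList("doc one", "doc two"), false);
        System.out.println(embedded.committed.size() + " documents committed");
    }
}
```

Keeping the URL-based method as a thin wrapper means existing callers are unaffected, while an application that already holds its own server object can pass it in directly. Whether this maps cleanly onto Nutch's indexing job is a separate question; Emre reports further up the thread that getting it to work required a hack.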
>> >> On Thu, Jun 7, 2012 at 11:27 PM, Markus Jelsma <[email protected]> wrote:
>> >>
>> >>> Hello!
>> >>>
>> >>> Sounds very interesting. Anyway, Solr can run embedded in a Java
>> >>> application by using EmbeddedSolrServer. You do need to make some
>> >>> changes to the SolrIndexer tools in Nutch.
>> >>>
>> >>> Cheers
>> >>>
>> >>> -----Original message-----
>> >>> > From: Emre Çelikten <[email protected]>
>> >>> > Sent: Thu 07-Jun-2012 22:24
>> >>> > To: [email protected]
>> >>> > Subject: Building Lucene index with Nutch 1.4
>> >>> >
>> >>> > Hello everybody,
>> >>> >
>> >>> > As part of a project, I am working on a FOSS tool that will build language
>> >>> > models using data obtained from the web which will then be used for speech
>> >>> > recognition. I plan to make this tool quite compact by encapsulating as
>> >>> > much as I can in a single Java application and not requiring the user to
>> >>> > install/configure tons of stuff.
>> >>> >
>> >>> > I have managed to set up Nutch and am able to crawl a website inside a Java
>> >>> > application. The next thing I need to do is to search for certain keywords
>> >>> > in the obtained data. I have read that the ability to build Lucene indexes
>> >>> > has been removed from Nutch and we now need to use Solr instead. The way
>> >>> > Solr works (servlets, HTTP) is not really appropriate for a tool that only
>> >>> > needs search functionality that is invisible to the user.
>> >>> >
>> >>> > What would you recommend I do in this case? Is there absolutely no way
>> >>> > of building Lucene indexes? I could not find anything other than
>> >>> > recommendations to use Solr instead. Should I try to use an older version
>> >>> > of Nutch?
>> >>> >
>> >>> > Thanks in advance,
>> >>> >
>> >>> > Emre
>> >>> >
>> >>>
>> >>
>> >>
>>
>>
>>
>> --
>> Lewis
>>



-- 
Lewis
