I wasn't advocating this. As I said, '(and you may not see much if any)'.
Comparisons of managed languages vs C++ seem to have widely varied results. Some claim the managed language is faster, some that it is slower.

The simple tests I've done with C# (which is sort of like Java but faster ... [no flames please. I don't really care if this statement is true or not!]) make me think C++ is 1.5 to 2 times faster for array-intensive work, mainly because the managed runtime checks array bounds a lot. And I would guess that some of Nutch falls into this category, but by no means all.

Personally, I would guess that you could get some 10-20 percent higher throughput if Nutch and Lucene were all native C++. But then you would have taken twice as long to write the code.

And I find writing in managed languages (Java, .NET) so much less frustrating and so much more productive that any small performance gains are irrelevant!

Iain

-----Original Message-----
From: reinhard schwab [mailto:[email protected]]
Sent: 04 August 2009 18:36
To: [email protected]
Subject: Re: Nutch in C++

Iain Downs wrote:
> I think there is probably a sub text here (I'm putting words in Otis' mouth,
> for which my apologies).
>
> ' Yes, you could rewrite Nutch in C++ and have that use CLucene.' But you'd
> be mad to do so!
>
> I'm a bit out of date with Nutch, but it's large. And Java to C++ is not an
> easy conversion because of the different memory management systems.
>
> And why? I guess you may see some performance improvement, but it would be
> a LOT cheaper to throw hardware at the problem (and you may not see much if
> any).
>
Performance improvement? Can you prove that C++ will be faster?
> So if you have a few months to spare ....
>
>
> Iain
>
> -----Original Message-----
> From: Otis Gospodnetic [mailto:[email protected]]
> Sent: 04 August 2009 04:49
> To: [email protected]
> Subject: Re: Nutch in C++
>
> CLucene is just like Lucene (except a few versions behind), but written in
> C++.
>
> Yes, you could rewrite Nutch in C++ and have that use CLucene.
>
> Otis
> --
> Sematext is hiring -- http://sematext.com/about/jobs.html?mls
> Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
>
> ----- Original Message ----
>
>> From: "[email protected]" <[email protected]>
>> To: [email protected]
>> Sent: Monday, August 3, 2009 2:29:40 PM
>> Subject: Re: Nutch in C++
>>
>> Hi,
>>
>> I know nutch uses Lucene. But for what is Clucene then? Only for indexing
>> files in a hard drive?
>>
>> I have knowledge of C++ and some experience. I wanted to code crawler of
>> Nutch in C++ to get more experience and make it open source, only if it'll
>> be useful for the open source community.
>> My goal is to get more experience in C++ and make contribution to open
>> source.
>> If you know other projects that may be more useful, please let me know.
>>
>> thanks.
>> Alex.
>>
>> -----Original Message-----
>> From: Otis Gospodnetic
>> To: [email protected]
>> Sent: Sun, Aug 2, 2009 8:15 pm
>> Subject: Re: Nutch in C++
>>
>> Nutch uses Lucene (Java), not CLucene (C++).
>>
>> Why are you looking to rewrite Nutch in C++ anyway? Sounds scary.
>>
>> Otis
>> --
>> Sematext is hiring -- http://sematext.com/about/jobs.html?mls
>> Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
>>
>> ----- Original Message ----
>>
>>> From: "[email protected]"
>>> To: [email protected]
>>> Sent: Thursday, July 30, 2009 3:13:16 PM
>>> Subject: Nutch in C++
>>>
>>> Hi,
>>>
>>> As I understood only indexing part of nutch is in C++ as clucene. I want
>>> to code nutch in C++, only in case if it is worth doing that. I wondered
>>> if is worth coding the remaining parts of nutch in C++, let say the
>>> crawler. Can someone give me directions on what to start.
>>>
>>> Thanks
>>> Alex.
>>>
