Lucene came out on top over native code search solutions in this particular benchmark, for instance:
http://zooie.wordpress.com/2009/07/06/a-comparison-of-open-source-search-engines-and-indexing-twitter/

But that's just one test, and one could quibble with how the tests were run.
If you're interested in Lucene, there is a native port in the works:
http://lucene.apache.org/lucy/

I think the answer to your question is 'yes' in general, since the libraries are reasonably extensible, and Java allows native code invocation through JNI. What in particular are you considering? "Lucene" covers a lot of ground.

Very broadly speaking, with proper care and feeding, decent code, and a modern JVM, the native/Java performance gap is not significant. I would not begin with an assumption that native code is a must.

I would suggest you try Lucene/Mahout first. It may surprise you with its performance. If not, ask the list for pointers -- these things inevitably need tuning to run optimally. *Then* think about writing a native code solution.

Sean

On Sat, Aug 22, 2009 at 7:50 PM, Tim Hughes<[email protected]> wrote:
>
> I'm working on a project which is considering the Apache Lucene/SOLR/Mahout
> tech stack for a data mining & machine learning project.
>
> The issue of Java algorithm performance vs C/C++ has come up, and I would
> like to know if it is possible to create custom algorithms in C/C++ and use
> them within the Mahout framework. I have been unable to find information on
> this.
> --
> View this message in context:
> http://www.nabble.com/Custom-Algorithm-%28C-C%2B%2B%29---tp25096676p25096676.html
> Sent from the Mahout User List mailing list archive at Nabble.com.
>
>
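
For anyone wondering what the JNI route mentioned above looks like from the Java side, here is a minimal sketch. Everything named in it is an assumption made up for illustration: the class `NativeDistance`, the native library `fastdistance`, and the method `nativeEuclidean` are hypothetical and are not part of Mahout or Lucene. It only shows the general pattern of declaring a `native` method, trying to load a C/C++ library, and falling back to plain Java when the library is missing, which also makes it easy to benchmark the two paths against each other before committing to native code.

```java
public final class NativeDistance {

    // True only if the (hypothetical) native library was found and loaded.
    private static final boolean NATIVE_AVAILABLE = tryLoad();

    private static boolean tryLoad() {
        try {
            // Assumption: a C/C++ library built as libfastdistance.so,
            // fastdistance.dll, or libfastdistance.dylib on java.library.path.
            System.loadLibrary("fastdistance");
            return true;
        } catch (UnsatisfiedLinkError | SecurityException e) {
            return false; // no native library: use the Java implementation
        }
    }

    // Implemented in C/C++ in the hypothetical library. Declaring it is
    // harmless; it only fails if called while the library is not loaded.
    private static native double nativeEuclidean(double[] a, double[] b);

    // Pure-Java implementation, used as fallback and as a baseline.
    public static double javaEuclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    // Entry point callers use; picks the native path when it is available.
    public static double euclidean(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have equal length");
        }
        return NATIVE_AVAILABLE ? nativeEuclidean(a, b) : javaEuclidean(a, b);
    }

    public static void main(String[] args) {
        double d = euclidean(new double[] {0, 0}, new double[] {3, 4});
        System.out.println("native=" + NATIVE_AVAILABLE + " distance=" + d);
    }
}
```

The C/C++ side would implement the function that `javac -h` generates a header for. Keep in mind that each JNI call has some overhead for crossing the boundary and accessing the arrays, so a native version tends to pay off only when a single call does a lot of work.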
