So, am I asking too much (maybe)? Is this forum dead (then where should I ask? There is extreme noise here)? Is Lucene perfect (of course not)?

On Wed, Jan 25, 2017 at 5:01 PM, Dorian Hoxha <dorian.ho...@gmail.com> wrote:

> I was also thinking about how Bing doesn't use posting lists
> <http://bitfunnel.org/strangeloop/> and also compiles queries
> <https://github.com/BitFunnel/NativeJIT>!
> About the queries, I would have thought the overhead wouldn't be as high as
> for queries in an RDBMS, since those apply to each row while in search they
> apply to each bitset.
>
> On Mon, Jan 23, 2017 at 6:04 PM, Jeff Wartes <jwar...@whitepages.com>
> wrote:
>
>> I’ve had some curiosity about this question too.
>>
>> For a while, I watched for a seastar-like library for the JVM, but
>> https://github.com/bestwpw/windmill was the only one I came across, and
>> it doesn’t seem to be going anywhere. Since one of the points of the JVM is
>> to abstract away the platform, I certainly wonder if the JVM will ever get
>> the kinds of machine affinity these other projects see. Your
>> one-shard-per-core could probably be faked with multiple JVMs and numactl -
>> could be an interesting experiment.
>>
>> That said, I’m aware that a phenomenal amount of optimization effort has
>> gone into Lucene, and I’d also be interested in hearing about things that
>> worked well.
>>
>> *From: *Dorian Hoxha <dorian.ho...@gmail.com>
>> *Reply-To: *"dev@lucene.apache.org" <dev@lucene.apache.org>
>> *Date: *Friday, January 20, 2017 at 8:12 AM
>> *To: *"dev@lucene.apache.org" <dev@lucene.apache.org>
>> *Subject: *How would you architect solr/lucene if you were starting from
>> scratch for them to be 10X+ faster/efficient ?
>>
>> Hi friends,
>>
>> I was thinking about how the scylladb architecture
>> <http://www.scylladb.com/technology/architecture/> works compared to
>> cassandra, which gives them 10x+ performance and lower latency. If you were
>> starting lucene and solr from scratch, what would you do to achieve
>> something similar?
>>
>> Different language (rust/c++?) for better SIMD
>> <http://blog-archive.griddynamics.com/2015/06/lucene-simd-codec-benchmark-and-future.html>
>> ?
>>
>> Use a GPU with an SSD for posting-list intersection? (not out yet)
>>
>> Make it in-memory and use better data structures?
>>
>> Shard on cores like scylladb (so 1 shard for each core on the machine)?
>>
>> External cache (like keeping n redis-servers with big ram/network & slow
>> cpu/disk just for cache)?
>>
>> Use better data structures (like algolia autocomplete radix
>> <https://blog.algolia.com/inside-the-algolia-engine-part-2-the-indexing-challenge-of-instant-search/>
>> )
>>
>> Distributing documents by term instead of id
>> <http://research.microsoft.com/en-us/um/people/trishulc/papers/Maguro.pdf>
>> ?
>>
>> Using ASIC / FPGA?
>>
>> Regards,
>>
>> Dorian