Hi Mike. I'm writing code for the Altera OpenCL SDK. I have a code base that gives me a non-Lucene format index. I was wondering, in your benchmark, what kind of data do you collect? Do you collect all the position and frequency data? I'm also curious about what you see as the biggest bottleneck in creating an index. Is it creating the index from the data, or merging the indexes? Or something else? Do you feel the algorithm is CPU, memory, or disk bound? And finally, do you think there is a market for accelerated indexing? Say I could quadruple the price/performance yet still make 100% Lucene-compatible indexes, would people pay for that?
Thanks, Steve
