Itamar,

Thanks for putting this together.

The demo made me notice something about the design of Analyzer that I hadn't seen before. The abstract Analyzer class was designed with Java's anonymous class functionality in mind. This makes creating custom Analyzers more concise in Java than it is in .NET. In .NET we don't have anonymous classes. But we DO have anonymous methods that we could use to simulate this behavior, provided there is a helper class to assist with it.

To demonstrate what I mean, I have updated the demo with a (very simple) AnonymousAnalyzer, which completely eliminates the need for the three analyzer classes that you made.

https://github.com/NightOwl888/LuceneNetDemo/blob/master/LuceneNetDemo/GitHubIndex.cs

I am not suggesting we should update the demo like this, but I am suggesting that we should add something like AnonymousAnalyzer (perhaps renamed to CustomAnalyzer, InlineAnalyzer, DelegateAnalyzer, or something else more appropriate) in the box so .NET developers can take advantage of their language's features in conjunction with Lucene the same way that Java developers do.

In fact, I think there are many things we can add (such as utility classes, utility methods, extension methods, and builders) that would make developing with Lucene almost as seamless in .NET as it is in Java - we just need to put our thinking caps on. For example, maybe there could be a TokenStreamComponentsBuilder that could be used to put the components together in a fluent way...?

Another thing I noticed is that we should probably move the TokenStreamComponents class so it is not a nested class of Analyzer, so that the syntax matches Lucene more closely.

A few thoughts on the demo:

1. Not everyone is familiar with GitHub organizations. Perhaps the demo should provide a list to choose from? Currently, if you type something that doesn't exist you get an exception. I had to do a Google search to come up with something, since my own username didn't work. One of the top results (before an actual list of organizations) was an API that can be used to read all of the GitHub organizations: https://developer.github.com/v3/orgs/

2. Maybe there should be some kind of estimate of how long it will take to index the organization. When I ultimately chose "apache" it took several minutes to index the results, which I was not expecting.

3. Perhaps the API key should be put into a separate (config) file rather than inline in the code. You could pre-define the name of this file and add it to a .gitignore file. This would help prevent anyone from accidentally committing their API key to the Git repo.

4. The search results seemed a bit underwhelming. Maybe there should be some kind of indicator of how many results Lucene.Net had to sift through to come up with the short list. Or at least there should be some kind of explanation of what is happening, to put things into perspective. Think of a crime scene investigation. If the investigators enter the search criteria and it comes up with 50,000 suspects, it would ruin their day. If it comes up with 3, then their work is much easier. But without some kind of indicator showing that 3 is better than 50,000, the latter seems much more impressive in a demo.

5. Perhaps there should be some way to reset the index? I entered another organization to test my updates to the code and it added that organization's results to the original index, which I wasn't expecting.

Thanks,
Shad Storhaug (NightOwl888)

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Itamar Syn-Hershko
Sent: Wednesday, November 9, 2016 6:45 AM
To: [email protected]; [email protected]
Subject: Lucene.NET 4.8 demo

Hey folks,

I just pushed a working demo for Lucene.NET 4.8 using the latest bits to index and search public repositories on github.
Check it out: https://github.com/synhershko/LuceneNetDemo

I also recorded a Channel 9 video walking through the demo - I will post it here again as soon as it's released on the nets. This should clarify some mysteries around the new-ish API and hopefully drive confidence in what we consider a stable beta release.

Cheers,

--

Itamar Syn-Hershko
http://code972.com | @synhershko <https://twitter.com/synhershko>
Freelance Developer & Consultant
Lucene.NET committer and PMC member
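
P.S. For anyone who doesn't want to click through to the repo, here is a rough sketch of the AnonymousAnalyzer idea from the reply above. It is not the code in the linked GitHubIndex.cs and it does not use the real Lucene.Net types: the Analyzer and TokenStreamComponents classes below are simplified stand-ins, and the Demo class is purely hypothetical. It only shows the general pattern, assuming the abstract class has a single factory method to override: a small sealed subclass takes a delegate in its constructor and forwards the abstract call to it, so the caller can write the factory logic inline.

```csharp
using System;
using System.IO;

// Stand-in types, NOT the Lucene.Net classes: just enough surface to show the pattern.
public sealed class TokenStreamComponents
{
    public TokenStreamComponents(string description) { Description = description; }
    public string Description { get; }
}

public abstract class Analyzer
{
    // Assumed shape of the abstract factory method that subclasses override.
    public abstract TokenStreamComponents CreateComponents(string fieldName, TextReader reader);
}

// Hypothetical helper: adapts a delegate to the abstract class, so callers
// can supply the factory logic inline instead of declaring a subclass.
public sealed class AnonymousAnalyzer : Analyzer
{
    private readonly Func<string, TextReader, TokenStreamComponents> createComponents;

    public AnonymousAnalyzer(Func<string, TextReader, TokenStreamComponents> createComponents)
    {
        this.createComponents = createComponents
            ?? throw new ArgumentNullException(nameof(createComponents));
    }

    public override TokenStreamComponents CreateComponents(string fieldName, TextReader reader)
    {
        // Forward the call to the delegate supplied by the caller.
        return createComponents(fieldName, reader);
    }
}

public static class Demo
{
    public static string Run()
    {
        // One analyzer defined inline, with no dedicated class.
        Analyzer analyzer = new AnonymousAnalyzer((fieldName, reader) =>
            new TokenStreamComponents("components for field '" + fieldName + "'"));

        using (var reader = new StringReader("Hello Lucene"))
        {
            return analyzer.CreateComponents("title", reader).Description;
        }
    }
}
```

The delegate plays the role that the overridden method body plays in a Java anonymous class. Whatever name the helper ends up with, the main design question is probably whether it should accept one delegate or grow optional ones for the other overridable members.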
