Itamar,

Thanks for putting this together.

The demo made me notice something about the design of Analyzer that I had 
missed before. The abstract Analyzer class was designed with Java's anonymous 
class functionality in mind, which makes creating custom Analyzers more concise 
in Java than it is in .NET.

In .NET we don't have anonymous classes (C# anonymous types cannot derive from 
a base class or override its methods). But we DO have anonymous methods that 
we could use to simulate this behavior, provided there is a helper class to 
assist with it. To demonstrate what I mean, I have updated the demo with a 
(very simple) AnonymousAnalyzer, which completely eliminates the need for the 3 
analyzer classes that you made:
https://github.com/NightOwl888/LuceneNetDemo/blob/master/LuceneNetDemo/GitHubIndex.cs
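
To make the comparison concrete, here is a minimal sketch in Java, since that 
is where the idiom comes from. Everything in it is an assumption for 
illustration: Analyzer and Components are simplified stand-ins rather than the 
real Lucene types, and DelegateAnalyzer is a hypothetical helper (it is not the 
AnonymousAnalyzer from the link above). The first factory method shows the 
anonymous class style; the helper shows the same thing done by passing a 
function, which is roughly what a .NET version taking a delegate would look 
like.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.Objects;
import java.util.function.BiFunction;

public class AnalyzerSketch {

    // Simplified stand-in for Lucene's TokenStreamComponents (NOT the real type).
    static final class Components {
        final String description;

        Components(String description) {
            this.description = description;
        }
    }

    // Simplified stand-in for Lucene's abstract Analyzer (NOT the real type).
    static abstract class Analyzer {
        protected abstract Components createComponents(String fieldName, Reader reader);

        // Public entry point so callers can reach the protected factory method.
        final Components componentsFor(String fieldName, Reader reader) {
            return createComponents(fieldName, reader);
        }
    }

    // Java idiom: subclass and override inline with an anonymous class.
    static Analyzer anonymousClassStyle() {
        return new Analyzer() {
            @Override
            protected Components createComponents(String fieldName, Reader reader) {
                return new Components("components for " + fieldName);
            }
        };
    }

    // Hypothetical helper: wraps a function so no subclass has to be written.
    // A C# version would presumably accept a delegate (for example a Func<...>)
    // in place of the BiFunction used here.
    static final class DelegateAnalyzer extends Analyzer {
        private final BiFunction<String, Reader, Components> factory;

        DelegateAnalyzer(BiFunction<String, Reader, Components> factory) {
            this.factory = Objects.requireNonNull(factory, "factory");
        }

        @Override
        protected Components createComponents(String fieldName, Reader reader) {
            return factory.apply(fieldName, reader);
        }
    }

    public static void main(String[] args) {
        Analyzer viaAnonymousClass = anonymousClassStyle();
        Analyzer viaFunction = new DelegateAnalyzer(
                (fieldName, reader) -> new Components("components for " + fieldName));

        // Both analyzers produce equivalent components for the same field.
        System.out.println(viaAnonymousClass.componentsFor("title", new StringReader("x")).description);
        System.out.println(viaFunction.componentsFor("title", new StringReader("x")).description);
    }
}
```

With a helper like this, each custom analyzer becomes a single expression at 
the call site instead of a separate class declaration.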

I am not suggesting we should update the demo like this, but I am suggesting 
that we should add something like AnonymousAnalyzer (perhaps renamed to 
CustomAnalyzer, InlineAnalyzer, DelegateAnalyzer, or something else more 
appropriate) in the box so .NET developers can take advantage of .NET's language 
features in conjunction with Lucene the same way that Java developers do. In 
fact, I think there are many things we can add (such as utility classes, 
utility methods, extension methods, and builders) that would make developing 
with Lucene almost as seamless in .NET as it is in Java - we just need to put 
our thinking caps on.

For example, maybe there could be a fluent TokenStreamComponentsBuilder that 
could be used to put the components together in a fluent way...?
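
A rough sketch of what such a builder might feel like, written in Java and 
purely hypothetical: no such class exists in Lucene, and to keep the example 
self-contained a "token stream" is modeled as a plain list of strings rather 
than a real TokenStream. The only point is the shape of the fluent chain: one 
tokenizer, followed by any number of filters applied in order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.function.Function;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;

public class ComponentsBuilderSketch {

    // Hypothetical fluent builder. Tokens are modeled as List<String>
    // purely for illustration; a real version would chain TokenStreams.
    static final class Builder {
        private Function<String, List<String>> tokenizer;
        private final List<UnaryOperator<List<String>>> filters = new ArrayList<>();

        Builder tokenizer(Function<String, List<String>> tokenizer) {
            this.tokenizer = tokenizer;
            return this;
        }

        Builder filter(UnaryOperator<List<String>> filter) {
            filters.add(filter);
            return this;
        }

        // Composes the tokenizer with each filter, in the order they were added.
        Function<String, List<String>> build() {
            if (tokenizer == null) {
                throw new IllegalStateException("a tokenizer is required");
            }
            Function<String, List<String>> chain = tokenizer;
            for (UnaryOperator<List<String>> filter : filters) {
                chain = chain.andThen(filter);
            }
            return chain;
        }
    }

    // Example chain: split on whitespace, then lowercase every term.
    static List<String> analyze(String text) {
        Function<String, List<String>> chain = new Builder()
                .tokenizer(t -> Arrays.asList(t.trim().split("\\s+")))
                .filter(terms -> terms.stream()
                        .map(term -> term.toLowerCase(Locale.ROOT))
                        .collect(Collectors.toList()))
                .build();
        return chain.apply(text);
    }

    public static void main(String[] args) {
        System.out.println(analyze("Lucene NET Demo")); // prints [lucene, net, demo]
    }
}
```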

Another thing I noticed is that we should probably move the 
TokenStreamComponents class out of Analyzer, where it is currently a nested 
class, so that code using it reads more like the Java Lucene equivalent.
    

A few thoughts on the demo:

1. Not everyone is familiar with a GitHub organization. Perhaps the demo should 
provide a list to choose from? Currently, if you type something that doesn't 
exist you get an exception. I had to do a Google search to come up with 
something, since my own username didn't work. One of the top results (before an 
actual list of organizations) was an API that can be utilized to read all of 
the GitHub organizations: https://developer.github.com/v3/orgs/
2. Maybe there should be some kind of estimate given on how long it will take 
to index the organization. When I ultimately chose "apache" it took several 
minutes to index the results, which I was not expecting.
3. Perhaps the API key should be put into a separate (config) file rather than 
inline in the code. And you could pre-define the name of this file and put it 
into a .gitignore file. This would help prevent anyone from accidentally 
committing their API key to the Git repo.
4. The search results seemed a bit underwhelming. Maybe there should be some 
kind of indicator of how many results Lucene.Net had to sift through to come up 
with the short list, or at least some explanation of what is happening, to put 
things into perspective. Think of a crime scene 
investigation. If the investigators enter the search criteria and it comes up 
with 50,000 suspects it would ruin their day. If it comes up with 3, then their 
work is much easier. But without some kind of indicator showing that 3 is 
better than 50,000, the latter seems much more impressive in a demo.
5. Perhaps there should be some way to reset the index? I entered another 
organization to test my updates to the code and it added that organization's 
results to the original index, which I wasn't expecting.


Thanks,
Shad Storhaug (NightOwl888)


-----Original Message-----
From: [email protected] [mailto:[email protected]] On 
Behalf Of Itamar Syn-Hershko
Sent: Wednesday, November 9, 2016 6:45 AM
To: [email protected]; [email protected]
Subject: Lucene.NET 4.8 demo

Hey folks,

I just pushed a working demo for Lucene.NET 4.8 using the latest bits to index 
and search public repositories on github. Check it out:
https://github.com/synhershko/LuceneNetDemo

I also recorded a Channel 9 video walking through the demo - I will post it 
here again as soon as it's released on the nets.

This should clarify some mysteries around the new-ish API and hopefully drive 
confidence in what we consider a stable beta release.

Cheers,

--

Itamar Syn-Hershko
http://code972.com | @synhershko <https://twitter.com/synhershko>
Freelance Developer & Consultant
Lucene.NET committer and PMC member
