On Dec 29, 2008, at 9:50 AM, Aleksander M. Stensby wrote:
I'm not sure if this is the place to ask, so please feel free to
correct me if I'm completely wrong.
I've been using Lucene for about three years or so now, and Solr for
about 6 months or so. I really love what you guys are doing for the
community and I would like to help in whatever way I can.
I asked on the solr list about maybe it would be nice to have SolrJ
support for handling the term vector component responses in a
similar fashion to how facets are treated today, and thought that I
could help by adding this functionality. Since I'm new to
contributing (yes I have read the wiki etc, so I know the basics),
but I just wanted to ask a few things:
- Regarding naming conventions, do you always use the
_privateVariable vs. parameterVariable naming?
I would say we generally prefer the Java conventions, which I believe
means the latter: privateVariable, parameterVariable. I personally
don't like the _ prefix, but we aren't that much of sticklers over it.
- When testing, I figured that I would add my tests to the
SolrExampleTests.java class, so that I can run them against embedded
solr, jetty etc etc. Is that the correct approach?
Not sure why a unit test would need Jetty. All you need to check,
IMO, is that the new methods create valid objects given valid inputs,
and also properly handle when no term vector info is present.
Otherwise, you are testing all the serialization, etc. Personally, I
really don't like the fact that our core tests have a dependency on
the example configuration. I know the tests are useful, just don't
think they belong in the core.
SolrQueryTest may be more appropriate, but feel free to add your own
standalone test.
- Specifically for my contribution, dealing with the term vector
response handler,
The TVRH is just an example of the use of the TermVectorComponent. It
is intended that one would configure the TermVectorComponent on their
own ReqHandler, so as not to have to make an extra call to Solr.
is it okay to use something like:
query.set(CommonParams.QT, "tvrh");
in the test? or is this bad? i just figured that you dont want such
parameters forced into the SolrQuery class... or? What do you all
suggest? for highlighting and facets its straigtforward since you
don't need to set any QT, and you can just say
solrQuery.setHighlight(true); etc. But i guess it might be bad to
force the handler into the SolrQuery class, if someone decides to
not register the handler in their solrconfig.xml... Any thoughts here?
All solrQuery.setHightlight does is:
this.set(HighlightParams.HIGHLIGHT, true);
which assumes that highlighting is configured for that RequestHandler
(which it is by default)
In your case, you could have:
SolrQuery.setTermVectors(boolean)
and it should add a param for TermVectorComponent.tv = true. There is
no need to have the request handler involved.
And how "done" should my patch be before i add it to jira?
I guess that depends on whether you want it committed or not :-) I'd
recommend an iterative approach. You don't need to have it done for
the first time you put it up, but you should make sure you care and
feed it based on comments that you get back and post a new patch.
At a minimum, I would say it should compile and all tests should still
pass, but that is not a hard and fast rule as it is OK to put up
patches where you are explicitly seeking feedback on the APIs.
Since the TVC is something I wrote, I'll likely take a look at
whatever you put up and provide feedback at some point, but I'm pretty
swamped at the moment, so please don't take it wrong. Of course,
others are free to help as they see fit.
HTH,
Grant