I regard the compatibility with Java Lucene - both at file and API level - as a 
key reason for usage - it's a very well established project with onward 
momentum delivering a great functionality stack. As others have said, it's a 
pretty specialist area, and I trust the people over at Java Lucene, 
particularly Doug Cutting, to know what they're doing and keep delivering. The 
current perceived goal of Lucene.NET - to deliver a search solution that is 
file- and API-compatible with Java Lucene - fits our needs.

We might speculate that a divergent .NET search solution might attract a group 
of keen developers who would make it a fantastic, more .NET-friendly resource 
with improved features, but then again it might not. And if it is such an 
attractive idea to .NET [...] Of course, a .NET-friendly wrapper - as an 
option - would not be a bad thing.

In the event of serious divergence, it might be as easy for us to move the 
search functions to a Java Lucene solution and wrap it with a service-oriented 
.NET-accessible wrapper. We already handle all our search functions through a 
wrapper library and run a separate search server with .NET remoting access so 
the change architecturally would not be enormous.
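
A minimal sketch of what such a service-oriented wrapper could look like on the Java side, assuming a plain HTTP endpoint rather than .NET remoting. The `SearchBackend` interface, the `InMemoryBackend` stand-in, and the `/search?q=...` URL shape are all hypothetical names made up for illustration; a real backend would delegate to Java Lucene instead of matching substrings. Only the JDK's built-in `com.sun.net.httpserver` package is used.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical abstraction; a real implementation would delegate to Java Lucene. */
interface SearchBackend {
    List<String> search(String query, int maxHits);
}

/** Stand-in backend: case-insensitive substring match over a fixed list of titles. */
class InMemoryBackend implements SearchBackend {
    private final List<String> titles;

    InMemoryBackend(List<String> titles) {
        this.titles = titles;
    }

    @Override
    public List<String> search(String query, int maxHits) {
        String needle = query.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (String title : titles) {
            if (hits.size() >= maxHits) {
                break;
            }
            if (title.toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(title);
            }
        }
        return hits;
    }
}

class SearchService {
    /** Extracts and URL-decodes the "q" parameter from a raw query string such as "q=foo&max=5". */
    static String queryParam(String rawQuery) {
        if (rawQuery == null) {
            return "";
        }
        for (String pair : rawQuery.split("&")) {
            if (pair.startsWith("q=")) {
                return URLDecoder.decode(pair.substring(2), StandardCharsets.UTF_8);
            }
        }
        return "";
    }

    /** Renders hits as newline-separated plain text, which any client can parse. */
    static String render(List<String> hits) {
        return String.join("\n", hits);
    }

    /** Starts an HTTP server exposing GET /search?q=... on the given port. */
    static HttpServer start(int port, SearchBackend backend) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/search", exchange -> handle(exchange, backend));
        server.start();
        return server;
    }

    private static void handle(HttpExchange exchange, SearchBackend backend) throws IOException {
        String query = queryParam(exchange.getRequestURI().getRawQuery());
        byte[] body = render(backend.search(query, 10)).getBytes(StandardCharsets.UTF_8);
        exchange.getResponseHeaders().set("Content-Type", "text/plain; charset=utf-8");
        exchange.sendResponseHeaders(200, body.length);
        try (OutputStream out = exchange.getResponseBody()) {
            out.write(body);
        }
    }
}
```

Calling `SearchService.start(8080, backend)` would expose the search function to any HTTP-capable client, .NET included, so the existing .NET wrapper library would only need to swap its transport. The port number and the limit of 10 hits are arbitrary choices for the sketch.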

So what could I and other commercial users without spare dev capacity do to 
help support the developers in the existing goal of Lucene.NET?

Does money come into it? Search has quite a high commercial value, and those of 
us who are using Lucene.NET are getting a substantial degree of search 
functionality for which there is little non-commercial competition. I'm 
obviously not suggesting commercialising the project (Blackbox tried that and 
it didn't work, and in any event I believe in the open source ethos, if you can 
honestly say that and still buy Microsoft products :->), but someone commented 
that a reason for not attracting developers is the lack of sponsorship. 

Is there some way for those of us who are small-ish commercial users (so not in 
a position to contribute developer time or large-scale sponsorship) to 
contribute monetarily to the development of Lucene, and would that make a real 
difference to the developers, current or potential, in terms of how much time 
they make available to the project? If it would, you could reach out to me and 
the other commercial users on this list or elsewhere and see what kind of fund 
you could get together, or offer a "donate now" feature.

I agree with all those who've said that simplifying the porting process and 
reducing its dependence on George is one of the easiest options to take to get 
the project back to some kind of release schedule, and am willing to part- or 
whole-finance (depending how much it is!) any related conversion tool that 
needs buying. 


Moray
-------------------------------------
Moray McConnachie
Director of IT    +44 1865 261 600
Oxford Analytica  http://www.oxan.com

-----Original Message-----
From: Peter Miller [mailto:peter.mil...@condenast.co.uk]
Sent: 04 November 2010 23:57
To: lucene-net-user@lucene.apache.org
Subject: RE: Lucene.NET Community Status

Firstly, I am (one of many, I'm sure) someone who has used Lucene.net for a few 
years now and has relied on its maturity and stability. In fact, all of our 
high-traffic, consumer-facing websites use Lucene.net as a primary data source 
for fast local reads, not just for searching.

The cause for this discussion was lack of contribution activity, and I think 
everyone agrees that we would rather the project keeps progressing. However, 
there is a real difficulty progressing and it seems like the reason for that is 
both a lack of certainty in the project's direction and some barriers to entry 
for contribution. It therefore feels necessary to decide or reiterate as a 
community which of the following is a priority (if they are indeed at odds with 
each other), and stick to those values.

1. How much value is there in staying compatible in usage with Java Lucene (i.e. 
files can be used by either version at the same time)?
        - People have expressed value in being able to use the tools across the 
versions. Is this still important?
2. How much value is there in staying syntactically similar to Java Lucene?
        - Is it more important to utilise knowledge and documentation across 
the versions or to make a library architected in a modern .Net way for 
Lucene-style indexing?
3. What is the best environment for this project - ASF / CodePlex?
        - Do the answers to the above values influence this decision? For 
example, if it is decided to keep as similar as possible to the Java version, 
perhaps it makes sense to stay in ASF, but if it's important to create a .Net 
architecture for the project, perhaps it makes more sense to go to CodePlex, as 
this could be more friendly for .Net architect contributors.

I do feel that Lucene.net has got to the reliable point it is at today (with 
version 2) because the leadership did uphold strong values about the purpose of 
the project and the code, and I am in awe of the committers and other 
contributors who have brought the project this far. If those values are to 
change then so be it, but we must decide on each value. Perhaps if there is a 
strong group on both sides then more than 1 project will need to exist, but I 
sincerely hope that doesn't happen since the reason for this position was a 
lack of contributions on the 1 project which exists.

Once the values are decided then as long as we can reduce the barriers to 
contribution by having a clear project vision and methodology (with minimal 
bottlenecks or single points of failure in terms of contribution) then we can 
look to a brighter future with more contributions from the very reliant 
community of developers and users.

My personal opinions:
1. There is a lot of value maintaining read+write-compatibility with Java 
Lucene, at least at this point.
2. Staying very syntactically similar to Java Lucene is how the project has 
gotten this far, and I feel it's how it should progress. Again - at this point.
3. I do feel ASF is unfamiliar to .Net OS people, especially when it comes to 
contributions - I love CodePlex and have seen some great fast-moving .Net 
projects grow on it.

However, (1) and (2) are only my opinions if we can lower the barriers to 
contribution and find an iteration process which isn't reliant on a single 
person, and isn't heavily time-consuming and laborious.

Pete Miller
Head of Technology and Development, Condé Nast Digital twitter.com/petemill

-----Original Message-----
From: Phil Haack [mailto:phi...@microsoft.com]
Sent: 04 November 2010 16:35
To: lucene-net-user@lucene.apache.org
Subject: RE: Lucene.NET Community Status

I think Josh was making the point (and correct me if I'm wrong Josh), that you 
can do #1 and #2 without being in the ASF.

There's being "close to Lucene" in the sense of staying close to the 
line-by-line port (which I'm currently in favor of given everything I've heard 
in this discussion) and there's being "close to lucene" in the sense of staying 
within the ASF and being a part of the Lucene project.

Per what I said in another thread, I'm not advocating moving away from the ASF, 
but just saying it's an idea worth exploring in combination with staying close 
to Lucene implementation-wise.

Phil

-----Original Message-----
From: Arne Claassen [mailto:ar...@mindtouch.com]
Sent: Thursday, November 04, 2010 9:30 AM
To: lucene-net-user@lucene.apache.org
Subject: Re: Lucene.NET Community Status

Staying close to Lucene helps in two ways:
1) When you google Lucene you are most likely to find examples based on 
java and those examples are much more useful if you can cut and paste with only 
minor changes. If we diverge we would be better off changing names so that at 
least searches are not producing more confusion. But it would mean a large 
documentation and tutorial effort, for which we don't seem to have the manpower.

2) Staying in sync with Java code is a lot simpler. The further the APIs 
diverge the more knowledge of how everything works is required.
Not saying that knowledge wouldn't be useful, but given the history of 
commitment, it seems unlikely we're going to acquire that knowledge quickly.

And let's even say we get 1 or 2 new dedicated contributors that really grok 
the internals and create the Idiomatic .NET port and keep it in sync. If they 
ever leave (and in most OS projects, that is a high likelihood), that 
specialized knowledge is gone and now you have two diverged products without 
the expertise to keep in sync, which with every commit that isn't ported gets 
worse.

I really would prefer to see a true .NET version and I know I'd be a lot more 
enthusiastic working on such a code base myself, but considering that this 
whole discussion started because there aren't enough active committers and the 
ones we have are overworked, I just don't see how a change in API buys us more 
value than risk.

Arne Claassen

MindTouch
San Diego, CA
http://twitter.com/sdether

On Nov 4, 2010, at 9:10 AM, Josh Handel wrote:

> Lucene is still open source, Developers can still be on the Lucene
> lists, and view the Lucene source (committed, patches, Jira issues)..
> How does being "close" provide MORE access than that? If Lucene was a
> closed source project, or didn't have a public repo, and didn't have
> mail-lists, or closed those mail lists to "only java and select .NET
> people" then I could understand being close.. But being "close" is
> merely a conceptual thing here.. We don't actually have any extra
> access because we are Lucene.NET under the ASF..  It's not like if we
> emailed the java list with a question about implementation we would be
> turned away because we are "those .NET guys" not any more than we
> might already be :-P..
>
> And there are plenty of top level highly production products hosted on
> CodePlex, GitHub (Rails anyone!) and SourceForge (DotNetNuke before
> it went privateish)... The ASF provides process and a brand that
> creates trust.. But those things are not exclusive to the ASF nor does
> not being hosted by ASF mean those things don't exist in your
> project.. The merits of a specific project, and the quality of that
> project and its community is what drives trust.. the ASF works hard to
> maintain that in their projects.. In fact being ASF doesn't even
> guarantee that quality, for instance, my client ISN'T using an ASF
> project, because it's not up to snuff and still kind of buggy (and
> it's not in incubation, and it has a very active community)..
> Ergo, we are looking elsewhere to solve that technical problem (this
> was not Lucene by the way)...
>
> I guess what I am getting at is that:
>
> *  "being close" doesn't in practice provide special access because
> Lucene is itself an Open Source Project.
> * ASF is a good brand but brand alone does not = quality
> * NOT being ASF != No Quality or adoption.
>
> Josh
>
>
> -----Original Message-----
> From: Granroth, Neal V. [mailto:neal.granr...@thermofisher.com]
> Sent: Thursday, November 04, 2010 10:20 AM
> To: lucene-net-user@lucene.apache.org
> Subject: RE: Lucene.NET Community Status
>
> Others have expressed these points better than I am able.
>
> The algorithms that make up Lucene were expressed in the Java project
> and active development occurs there.  All ports of Lucene benefit from
> work done there and we are all free to contribute to its ongoing
> development.
> Separating the project would sever these beneficial connections.
>
> The ASF formal processes are also beneficial.  In the environment in
> which I work I can readily justify use of open source developed within
> its guidelines.  It is extremely difficult to justify use of open
> source from ad-hoc projects on CodePlex or elsewhere.
>
> So, yes it is pretty cut and dry.
>
> - Neal
>
> -----Original Message-----
> From: Josh Handel [mailto:josh.han...@catapultsystems.com]
> Sent: Thursday, November 04, 2010 10:01 AM
> To: lucene-net-user@lucene.apache.org
> Subject: RE: Lucene.NET Community Status
>
> I think there are more than 4 people interested in a .NET centric
> API.. We have also talked about how moving to a more .NET
> friendly forge may in fact attract new blood.. And given that Lucene is
> open source, and as of yet, I haven't seen the Lucene (Java) guys show
> any special deference to the .NET project.. I'm not sure where (beyond
> brand, some potential legal support, and a rigorous process)
> being "close" to Lucene (Java) is providing value..
>
> Plus the Line by Line port is kinda what got us here in the first
> place.. if the project had done a conceptual port (like NUnit,
> NHibernate, NAnt) rather than line by line, then we would have built
> up the search expertise we are now lacking.. And we might not have
> burned out our top talent with an overly boring and laborious
> process..
>
> Just saying, it's not as cut and dry as you are selling it right now
> ;-).
>
> Josh
>
> -----Original Message-----
> From: Granroth, Neal V. [mailto:neal.granr...@thermofisher.com]
> Sent: Thursday, November 04, 2010 9:55 AM
> To: lucene-net-user@lucene.apache.org
> Subject: RE: Lucene.NET Community Status
>
> You are forgetting the people who test, report bugs, and assist new
> users.
> There are a lot more people involved than just the four you've listed.
>
> You are also forgetting the discussion that pointed out a move to
> CodePlex or other location outside ASF would most likely kill the
> project.  The project greatly benefits from close connections to
> Lucene (Java).
>
> - Neal
>
>
>
> -----Original Message-----
> From: Troy Howard [mailto:thowar...@gmail.com]
> Sent: Thursday, November 04, 2010 1:17 AM
> To: lucene-net-user@lucene.apache.org
> Subject: Re: Lucene.NET Community Status
>
> Phil,
>
> I've been unsuccessful at finding the specific reference in the ASF
> policy that covers this, but in a nutshell, yes, the code must be
> hosted by ASF, as well as the websites, docs, etc... This will prevent
> anything other than a mirror or branch existing on CodePlex.com.
>
> If the project leaves ASF this is not a concern of course, but it will
> need to change its name.
>
> There are currently four committers (taken from
> http://lucene.apache.org/lucene.net/
>  ):
>
> George Aroush geo...@aroush.net
> Işık YİĞİT (DIGY) digyd...@gmail.com
> Doug Sale ds...@myspace-inc.com
> Michael Garski mgar...@myspace-inc.com
>
>
> Based on SVN log commit activity, DIGY is the most recent committer.
> mgarski's last commit was in 03/2010, and aroush's in 12/2009.
>
> Currently only George and DIGY are showing interest in the project in
> the mailing lists. I would say that they are the people who would fit
> the bill of "core active project leaders".
>
> Thanks,
> Troy
>
>
> On Wed, Nov 3, 2010 at 9:52 PM, Phil Haack <phi...@microsoft.com>
> wrote:
>
>> I have a couple of naive questions, so forgive me. I see that Apache
>> projects use SVN http://www.apache.org/dev/version-control.html
>>
>> But is it required to host Apache projects in this svn? The reason I
>> ask is that a small change to hosting in a forge like CodePlex.com
>> would provide the project huge exposure to more .NET developers. You
>> could do this, but keep all other processes the same.
>>
>> Also, it makes keeping documentation really easy since they support
>> MetaWeblog API, and thus Windows Live Writer. It might seem like I'm
>> trying to hawk technology answers to organizational problems, but
>> you'd be surprised by how much reducing the friction of documentation
>> makes people more willing to write documentation. At least, for the
>> projects I work on, I'm more than willing to contribute documentation
>> when I can just point a blog client to it and publish.
>>
>> The other question I ask is who are the core active project leaders
>> for Lucene.NET? I'd really like to understand what they want. I and
>> others have many ideas, but it'd be helpful to understand what
>> direction they want to take things and what things are non-negotiable
>> so we have a framework to work with.
>>
>> Thanks,
>> Phil
>>
>>
>>
>>
>> ________________________________________
>> From: Troy Howard [thowar...@gmail.com]
>> Sent: Wednesday, November 03, 2010 8:47 PM
>> To: lucene-net-user@lucene.apache.org
>> Subject: Re: Lucene.NET Community Status
>>
>> All,
>>
>> I'm entering this conversation late as well. I'll apologize in
>> advance, as I know this will be lengthy.
>>
>> Briefly, I'll list my "credentials" and reasons for concern here:
>>
>> - I've been using Lucene.Net for many years since the early versions
>> and have built significant products for my company using it. Those
>> products are a core source of our revenue, which is measured in the
>> millions of $$s. The success of my company's products is directly
>> dependent on the success of the Lucene.Net project.
>>
>> - I run software development at my company and make the final
>> decisions about what we do and how we use our resources. The
>> developers here work on open source code on our clock. I would like
>> to have them start doing this for Lucene.Net. We have very smart and
>> productive people who could be a huge asset to this project. I hope
>> that the opportunity to leverage my company's team will not be
>> bypassed by the people running this project.
>>
>> - I have hacked extensively on the Lucene.Net internals to improve
>> performance in our product and have been manually maintaining our
>> local branch, merging in changes from the main project. I feel I have
>> enough knowledge of both the CS theory behind search engines and in
>> particular this codebase to not be intimidated by any aspect of the
>> needs of this project.
>>
>> - I started a similar kind of open source project in that it is a
>> .Net implementation of an existing C++ open source project and
>> struggled with the "syntactic port" vs "conceptual port" issue, and
>> so have perspective to provide on that discussion
>>
>>
>> Relationship To ASF and Lucene
>> -----------------------------------------------
>>
>> I'd like to address one thing upfront: This should definitely remain
>> an Apache Software Foundation project. As Grant and George have
>> stated clearly and accurately, this is a huge benefit for this
>> project in terms of its credibility. This is not just because the
>> name is well respected. It's because of WHY the Apache name is so
>> well respected: the processes and values of the Foundation set
>> excellent standards which encourage excellent code. This is not just
>> my opinion, but can
>> be objectively proven by the enormous success of the Apache projects.
>> Complying with ASF's standards may be difficult, but it's extremely
>> valuable.
>>
>> I feel that Grant's recommendation of attempting to become a TLP at
>> Apache is the wrong direction. This should remain part of the Lucene
>> project. It is not distinct in any substantial way from Lucene and thus
>> doesn't warrant being separate.
>>
>> Also, there was some mention of Lucene's file format and maintaining
>> that compatibility. This is essential. If this ever changes,
>> Lucene.Net will be useless. Being cross-platform and having a very
>> stable on-disk format is one of its most compelling aspects.
>>
>>
>> Microsoft's Interest and Involvement
>> ---------------------------------------------------
>>
>> Another thing to mention: Phil Haack and Scott Hanselman, while both
>> are Microsoft employees, are more than just representatives of the
>> company they work for. They are both outstanding advocates of open
>> source software and have been instrumental in the change of attitude
>> that Microsoft has shown in recent years towards this community. The
>> fact that they have shown interest in this issue doesn't mean
>> Microsoft is interested, it means that this is a significant issue
>> for the .Net open source community. The fact that they work for
>> Microsoft means that they may be able to leverage resources and wield
>> clout from that vantage point that can benefit our community greatly.
>>
>> Regarding the question "What can Microsoft do to help"?.... I'll take
>> a somewhat radical stance here.
>>
>> We need Visual J# not to have been abandoned... We need IronJava,
>> like IronPython or IronRuby. We need a native, MS developed and
>> supported, fully optimized and performant compiler for plain old Java
>> code that runs on the .Net runtime and exposes Java libraries to
>> other .Net languages like F#, C#, VB, etc..
>>
>> There is a huge wealth of open source Java code out there, much of it
>> in the Apache project archives, which would all be "ported" at once.
>> Currently our community only gets access to Lucene.Net and iTextSharp
>> and a few other libraries where dedicated people like George put in
>> hard hours of direct syntax porting to implement these things in C#.
>>
>> We need more than that.
>>
>> I need Hadoop to run in .Net, along with HDFS, HBase, Solr, Nutch,
>> Tika, and everything else in that ecosystem.
>>
>> My company is actually at a critical point now, where we are
>> considering abandoning .Net/WCF as our service layer platform, and
>> switching to Java, so that we can leverage those excellent Java
>> projects. Our business needs demand that we have what Hadoop does. It
>> will be easier for me to migrate my application code to Java than to
>> attempt to find equivalent functionality in the existing .Net world
>> or write my own framework, or port Hadoop.
>>
>> So, if there was ONE thing that Microsoft could do to *significantly*
>> help the .Net developer community, it would be providing a *real*
>> implementation of IronJava which would obviate the need to port code
>> completely, and simply allow those libraries and applications to run
>> in .Net natively.
>>
>> That said, assuming that Visual J# remains "retired" (see:
>> http://msdn.microsoft.com/en-us/vjsharp/default ) this project is one
>> of the few things we .Net developers have to work with.
>>
>>
>> Java or .Net Code Idioms
>> -------------------------------------
>>
>> I agree that moving to a codebase that is more .Net idiomatic will
>> both improve the user experience of end users of Lucene.Net and
>> improve the level of involvement that we can get from the
>> community. To put it simply, right now, hacking on the Lucene.Net
>> core code means you must understand Java idioms well, and how to
>> translate those to .Net. This is a skill set which is somewhat
>> uncommon.
>>
>> The "direct port" methodology also leads to code that is not fully
>> optimized for .Net. I have changed our local branch in a number of
>> significant ways, and improved performance significantly by doing so.
>> I didn't change APIs, I just changed the implementations to be more
>> appropriate for .Net, and included generics.
>>
>> The test suite provided with Lucene/Lucene.Net is a great benefit in
>> that regard, and helped me ensure that my changes didn't break
>> functionality.
>> That said, the project needs to improve in this regard. The classes
>> themselves need to be implemented in a more "testable" manner.
>> Abstract base classes instead of interfaces makes the code less
>> mockable and thus less testable. It also makes it harder to implement
>> customized components into the system. There are a number of things
>> that are sealed or internal that do not need to be.
>>
>> Lucene (for Java) was awesome because it ran well as managed code and
>> was elegant and efficient in Java's environment. Any port of Lucene
>> should *retain those features* as well. The library should make sense
>> and be implemented in the most elegant and efficient way that it can
>> be on the platform it's implemented on. Lucene.Net should not be a
>> port of Java Lucene to .Net, it should be an *implementation* of
>> Lucene running in .Net. Porting implies line-for-line similarity.
>> Implementing just implies that the features are all represented.
>>
>> For that reason, I support moving to a more idiomatic .Net
>> implementation, verified by the unit tests. The argument that "it
>> will require smart people" to understand the core code -- that's a
>> *GOOD* requirement. If you don't understand how it works,
>> conceptually, perhaps you should not be attempting to implement it.
>> Merely porting or auto-converting
>> code that "seems to be the same" and "passes the unit tests", without
>> really understanding the details is not a safe way to ensure correct
>> operation. What if there was a subtle difference between the two
>> syntaxes which led to differing (i.e. incorrect) behaviour in some
>> scenarios? What if the unit tests didn't cover that scenario?
>>
>> Regarding the help and support provided by the Lucene community, and
>> the books and examples that provide code samples.. Changing to a more
>> .Net idiomatic codebase, even if that meant top level API changes,
>> would not be a substantial issue that would prevent a .Net developer
>> from understanding example code written in Java. If the API is
>> *basically* the same, but uses foo.Size instead of
>> foo.getSize()/foo.setSize() or List<T> instead of ArrayList... those
>> differences are minor and will not cause significant issues for
>> grokking cross-language examples. People will still get it... and .Net
>> developers will be much happier.
>>
>>
>> So, the takeaway is:
>> - My team and I will help hack on Lucene.Net and get paid to do it
>> - Lucene.Net should not change project status
>> - Microsoft should implement IronJava
>> - Moving towards idiomatic .Net code is the direction the project
>> should go and is not that big of a deal
>>
>>
>> Also, as a side-note. We're hiring in the Portland, Oregon area, and
>> could use developers who know Lucene.Net, and want to hack on it on
>> the clock.
>> Send me your resume.
>>
>>
>> Thanks,
>>
>> Troy Howard
>> Director of Software Development | discover-e Legal, LLC |
>> thowar...@gmail.com
>>
>
>



