I have posted about how I do an initial port several times in the past. You can search in the mail archives, but here are some pointers:
http://www.mail-archive.com/lucene-net-dev@incubator.apache.org/msg00401.html
http://www.mail-archive.com/lucene-u...@jakarta.apache.org/msg10860.html

Just do a search on "JLCA" in the mailing list for more background.

To sum up, a port (especially an initial port) isn't much fun, is time consuming, and can't be divided into smaller tasks to be distributed. With all of my past ports, it used to take me a little over a month to complete one, and this includes getting up to 80% of NUnit tests passing. For 2.9, it took me well over a month (with a lot more hours working on the project) and I never got the chance to address any NUnit test failures. Why? The delta between 2.9 and previous releases was considerable (major refactoring in Lucene Java), and a lot of new features and files were added in 2.9 (the code base grew by 30%).

-- George

-----Original Message-----
From: Nicholas Paldino [.NET/C# MVP] [mailto:casper...@caspershouse.com]
Sent: Monday, November 30, 2009 2:00 AM
To: lucene-net-dev@incubator.apache.org
Subject: RE: port of contrib packages from java

Rob,

I appreciate the input, but I feel that I might have been misunderstood. When I said a custom port, I meant for private consumption, not for public consumption.

To answer your question, there are a number of areas that would benefit from a .NET overhaul, which have been outlined before. Some specific ones are the multi-threaded searchers (using Threads with calls to Join for synchronization is a bit of a dog; the ThreadPool can help there) and the replacement of ArrayList with List<T> (especially where the type parameter is a structure, there are performance issues due to boxing when using ArrayList instances).

Those are the two off the top of my head which would fall within the purview of the Lucene.NET project, but which no one seems to be doing. The replacement of ArrayList with List<T> isn't even a call site change in 99% of the instances, just a declaration change.
This is probably the lowest-hanging fruit of all, and no one is doing it (Hashtables come to mind as well; those would require call site changes, but could easily be handled with an extension method on IDictionary<TKey, TValue>).

Why? To be honest, none of the answers are really satisfactory. The current commits are only being made if they help drive towards passing the test cases. I'm not saying that's not a goal to try and direct people towards, but being open source, you have to take what you can get when it comes to the work that people contribute (I'm not saying you have to ^accept it^ mind you).

With that, not everyone wants to see a line-for-line port of Lucene from Java to .NET. People would like to address pain points that come with an implementation that isn't very .NET friendly, as well as an API that is unfriendly. I know the latter point is not up for discussion, but you indicate your desire for having the project fulfill your particular vision. I respect that vision, but I have one as well. I don't think it's unfair to say that there are others who share it.

While I agree that catching up to Java is an achievable goal, there is no timeline for that goal (nor do you give one, mind you), and my impression is that it's not one that will be accomplished anytime soon. George implies that this is the case given the current level of contribution (and if I am misrepresenting you, George, I apologize; this is how I read your response).

All this being said, I see the discussion as moot, given my first statement about not making it available for public consumption. I simply want access to the process for my own individual consumption. Given the open source nature of the project, I don't see why it should be unavailable.
I should also note that I am not looking to stop contributing to the project, but given the current direction that it is going, I have needs and desires for it that I would like to address, and I feel comfortable doing work that I know will not be shared with others, but which will fully attribute the original source of the work.

That being said, are those tools and information on the process available?

- Nicholas Paldino [.NET/C# MVP]

-----Original Message-----
From: Ron Grabowski [mailto:rongrabow...@yahoo.com]
Sent: Monday, November 30, 2009 12:53 AM
To: lucene-net-dev@incubator.apache.org
Subject: Re: port of contrib packages from java

I agree with George. Catching up to Java (within a week or so of their SVN commits) seems like an achievable goal. The work being done on 2.9 is only about a month off the Java release.

I'm concerned that having more of a .NET internal API would cause the project to slow down adopting new features. Take the PHP Lucene port for example... it's sort of a port of Lucene, but I couldn't find anything on the site detailing what version they branched from. I doubt they've incorporated the new features of 2.4, 2.9, etc. into their port or even have plans to be 3.0 compliant. I'd rather have a .NET port that is 10% slower but can more easily adapt new features from the parent project than a super-sweet .NET API where people have to bend over backwards to re-re-implement parent project features.

Do we need to make the internal API more .NET-ish if people aren't going to use it much? Do you have specific areas that might benefit from a .NET overhaul?

----- Original Message ----
From: Nicholas Paldino [.NET/C# MVP] <casper...@caspershouse.com>
To: lucene-net-dev@incubator.apache.org
Sent: Sun, November 29, 2009 9:53:53 PM
Subject: RE: port of contrib packages from java

George,

If that is the case, then where can I get a hold of the tools/process that is used to port over the java version to .NET?
Being completely honest, I'd much rather just grab 3.0 from Java, do a port, and then have a custom version which is more to my liking implementation and API-wise (still honoring the Apache license, of course).

While I very much like what Lucene does (and I am speaking in a general sense, not the .NET specific version), the .NET version suffers from this lack of resources, which unfortunately will keep it in this perpetual state.

- Nick

-----Original Message-----
From: George Aroush [mailto:geo...@aroush.net]
Sent: Sunday, November 29, 2009 12:40 AM
To: lucene-net-dev@incubator.apache.org
Subject: RE: port of contrib packages from java

I'm not discouraging the use of .NET 3.5, or making Lucene.Net fully .NET compliant. I'm simply trying to set expectations, as this is not the first time this subject has come up.

As you can see, it has been over 1 month since I committed the initial port of 2.9, and even with good community help (we never had this much help in any previous release; it was just 2 or 3 of us) we still have about 14 NUnit tests failing! If the port were not a line-per-line port, not only would we have to deal with NUnit tests, but we might very well have to deal with index format, compatibility, corruption, and threading issues, to name some; the community would have to be well versed in Lucene's internals to address such issues. Are we ready for this? IMHO, no, we are not.

I believe we need to first prove that we can maintain a port at a commit-per-commit level (or no more than a week behind Lucene Java) before we commit to being fully .NET compliant and taking full advantage of it.

-- George

-----Original Message-----
From: Nicholas Paldino [.NET/C# MVP] [mailto:casper...@caspershouse.com]
Sent: Wednesday, November 25, 2009 10:09 PM
To: lucene-net-dev@incubator.apache.org
Subject: RE: port of contrib packages from java

George,

This brings up the question of whether or not work will be done to Lucene.NET to adhere to best practices in .NET development.
I'm not even suggesting the public-facing API, but doing internal work.

While I respect the desire to be able to be on a commit-by-commit basis with the Java project, there has been discussion in the past about moving to .NET 3.5 when Lucene 3.0 comes out (they are upgrading to a new version of the JVM at that point, from what I understand).

Even if the decision to move to .NET 3.5 is made, I can't see the benefit if all that is desired for the Lucene.NET port is to be a mirror of the Java version, because there aren't enough people who can maintain the project on a commit-per-commit basis. And while I don't have the metrics on those who have contributed, it doesn't seem like the project has the critical mass necessary to do this, which makes for a catch-22 situation.

Basically, there aren't enough people to keep the project current on a commit-by-commit basis with the Java project, and one of the big reasons that I think people aren't contributing is that they are severely limited by this tenet of having literally line-by-line parity between the two code bases. It's also a tenet which serves the limitations of the resources that the project has available to it, as opposed to the betterment of the project itself.

I'm not looking to bash the project or the people who have contributed (and I still want to contribute), but I don't see a point at which the goal of consistently matching the Java version will be met, so it makes me ask if there shouldn't be a discussion about shifting the priorities of the project to address some of the pain points for the audience that is using the product now (some examples being a sloppy API from a .NET perspective, inefficient internal implementations and other such "goodies").

Perhaps this is something that should be put to a vote as well (not that I know whose vote would matter or count, but it's something you suggested for the ports of the contrib projects)?
- Nick

-----Original Message-----
From: George Aroush [mailto:geo...@aroush.net]
Sent: Monday, November 23, 2009 11:21 PM
To: lucene-net-dev@incubator.apache.org
Subject: RE: port of contrib packages from java

Porting all of the code in contrib is going to be a challenge; there is a lot of code in there. So it makes sense to first port the packages that give us the most value (maybe via a vote). Also, what's ported now may no longer work with 2.9.1's Lucene.Net port; this is because the contrib.Net port has not been kept up to date. And yes, virtually every project in contrib has a JUnit test associated with it, so it can be used to validate a project port.

Regarding the .NET'es of ports, this has come up a few times in the past, and it's tempting to want to make Lucene.Net more .NET'es. However, this is very hard to achieve without solid commitment and being at a commit-per-commit port with Lucene Java (i.e.: anytime a commit in Lucene Java happens, it must, within days, be ported over to Lucene.Net and committed).

For many of the projects in contrib, the task of porting them is much simpler than it is for the core Lucene code. However, here is where things get challenging. Any time you think about making a port more .NET'es, you must keep the following in mind:

1) It will be more work and harder to keep the code in sync with the Java version (per the above reasons), and
2) The code in contrib may no longer work with the code in Lucene core due to the .NET'es of the port (mainly public APIs).

Thus, your effort at .NET'es of a contrib port may be limited if the Lucene core code isn't.

What's the takeaway? Until we can maintain a commit-per-commit port with Lucene Java, trying to make Lucene.Net and / or contrib more .NET'es isn't realistic.
-- George

-----Original Message-----
From: Eran Sevi [mailto:erans...@gmail.com]
Sent: Monday, November 23, 2009 3:12 PM
To: lucene-net-dev@incubator.apache.org
Subject: Re: port of contrib packages from java

Although some contrib packages might not be in use by any lucene .net user at the moment, I think we should port them all in accordance with the java version (it shouldn't be as hard as the core classes, although I'm not sure there are any tests for them).
When and if we diverge from the core java implementation in order to take advantage of .net and apply each patch as it comes, we can do the same for contrib, which also sees much less traffic anyway.

Eran

On Mon, Nov 23, 2009 at 8:27 PM, Digy <digyd...@gmail.com> wrote:

> I don't know whether there is such a preference for contribs or not, but
> diverging from Java makes life harder for further ports.
> Will someone be able to easily port the next release after your
> state-of-the-art work following .NET best practices?
> Or a new port from scratch?
>
> DIGY
>
> -----Original Message-----
> From: Nicholas Paldino [.NET/C# MVP] [mailto:casper...@caspershouse.com]
> Sent: Monday, November 23, 2009 7:11 PM
> To: lucene-net-dev@incubator.apache.org
> Subject: RE: port of contrib packages from java
>
> On a somewhat related note, these ports, do they adhere to the
> tenets applied to the main trunk, or can they better follow .NET best
> practices if one wants to apply them?
>
> - Nick
>
> -----Original Message-----
> From: Eran Sevi [mailto:erans...@gmail.com]
> Sent: Monday, November 23, 2009 8:45 AM
> To: lucene-net-dev@incubator.apache.org
> Subject: Re: port of contrib packages from java
>
> Thanks,
> I'm more into the "queries" package.
> If no one beats me to it, I hope I can help and add it myself.
>
> How did you do the port? Manually or using some conversion tools?
>
> Eran.
>
> On Mon, Nov 23, 2009 at 3:34 PM, Roger Chapman <ro...@stormid.com> wrote:
>
> > I've done a first pass port of the Spatial Contrib project :
> > https://issues.apache.org/jira/browse/LUCENENET-199
> >
> > Roger.
> >
> > -----Original Message-----
> > From: Eran Sevi [mailto:erans...@gmail.com]
> > Sent: 23 November 2009 13:17
> > To: lucene-net-dev@incubator.apache.org
> > Subject: port of contrib packages from java
> >
> > Hi,
> > Is there any thought to port all the contrib packages from java lucene
> > after the porting of core 2.9.1 version is complete?
> > Currently there are 23 packages in java contrib compared to only 7
> > packages in .net contrib.
> >
> > Thanks,
> > Eran.