RE: [Lucene.Net] Roadmap
Chris,

Sorry if you took my comments about the pain of porting personally. That wasn't my intention.

+1 for all your changes/divergences. I made, or could have made, them too.

DIGY

-----Original Message-----
From: Christopher Currens [mailto:currens.ch...@gmail.com]
Sent: Monday, November 21, 2011 11:45 PM
To: lucene-net-dev@lucene.apache.org
Subject: Re: [Lucene.Net] Roadmap

Digy,

I used 2.9.4 trunk as the base for the 3.0.3 branch, but I looked to the code in 2.9.4g as a reference for many things, particularly the Support classes. We hit many of the same issues, I'm sure. I moved some of the anonymous classes into a base class where you could inject functions, though not all could be replaced, nor did I replace all that could have been. Some of our code is different: I went for the option of making WeakDictionary completely generic, as in wrapping a generic dictionary with WeakKey<T> instead of wrapping the already existing WeakHashTable in Support. In hindsight, it may have just been easier to convert the WeakHashTable to generic, but alas, I'm only realizing that now. There is a problem with my WeakDictionary, specifically the function that determines when to clean/compact the dictionary and remove the dead keys. I need a better heuristic for deciding when to run the clean. That's a performance issue though.

Regarding the pain of porting, I am a changed man. It's nice, in a sad way, to know that I'm not the only one who experienced those difficulties. I used to be in the camp that porting code that differed from Java wouldn't be difficult at all. However, now I stand corrected! It threw me a curve-ball, for sure. I DO think a line-by-line port can definitely include the things talked about below, i.e. the changes to Dispose and the changes to IEnumerable<T>. Those changes, I think, can be made without a heavy impact on the porting process.
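As an illustration of the clean/compact question raised above, here is a minimal sketch of one possible heuristic: compact only when the map has grown to a threshold, then set the next threshold to twice the number of surviving entries, so the cost of scanning is amortized over inserts. It is written in Java rather than C#, and the names (WeakKeyMapSketch, WeakKey, MIN_THRESHOLD, cleanRuns) are hypothetical; it is not the WeakDictionary from either branch, and the starting threshold of 16 is an arbitrary assumption.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: a weak-keyed map that compacts itself on an amortized schedule. */
final class WeakKeyMapSketch<K, V> {

    /** Wraps a key weakly; the hash is cached so entries stay consistent after the key is collected. */
    private static final class WeakKey<K> extends WeakReference<K> {
        private final int hash;

        WeakKey(K key) {
            super(key);
            this.hash = key.hashCode();
        }

        @Override
        public int hashCode() {
            return hash;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof WeakKey)) return false;
            Object mine = get();
            Object theirs = ((WeakKey<?>) other).get();
            // A collected key equals nothing but itself.
            return mine != null && mine.equals(theirs);
        }
    }

    private static final int MIN_THRESHOLD = 16; // assumed starting point, not tuned

    private final Map<WeakKey<K>, V> map = new HashMap<>();
    private int cleanThreshold = MIN_THRESHOLD;
    private int cleanRuns = 0;

    V get(K key) {
        return map.get(new WeakKey<>(key));
    }

    void put(K key, V value) {
        map.put(new WeakKey<>(key), value);
        // Heuristic: only scan for dead keys once the map reaches the threshold.
        if (map.size() >= cleanThreshold) {
            clean();
        }
    }

    /** Drops entries whose key was collected, then sets the next threshold to twice the survivors. */
    private void clean() {
        map.keySet().removeIf(k -> k.get() == null);
        cleanThreshold = Math.max(MIN_THRESHOLD, map.size() * 2);
        cleanRuns++;
    }

    int size() {
        return map.size();
    }

    int cleanRuns() {
        return cleanRuns;
    }
}
```

With this schedule a map that only ever holds live keys is scanned at sizes 16, 32, 64 and so on, so the scan count grows logarithmically with the number of inserts. Whether that trade-off suits the real WeakDictionary would need measuring.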
There was one fairly large change I opted to use that differed quite a bit from Java, however, and that was the use of the TPL in ParallelMultiSearcher. It was far easier to port this way, and I don't think it affects the porting process too much. Java uses a helper class defined at the bottom of the source file that handles it; I'm simply using a built-in one instead. I just need to be careful about it, as it would be really easy to get carried away with it.

Thanks,
Christopher

On Mon, Nov 21, 2011 at 1:20 PM, Digy <digyd...@gmail.com> wrote:

Hi Chris,

First of all, thank you for your great work on the 3.0.3 branch. I suppose you took 2.9.4 as the code base for the 3.0.3 port, since some of your problems are the same as those I faced in the 2.9.4g branch (e.g., Support/MemoryMappedDirectory.cs (but never used in core), IDisposable, introduction of some Action<T>s and Func<T>s, foreach instead of GetEnumerator/MoveNext, IEquatable<T>, WeakDictionary<T>, Set<T>, etc.).

Since I also used 3.0.3 as a reference, maybe we can use some of 2.9.4g's code in 3.0.3 when necessary (I haven't had time to look into 3.0.3 deeply).

Just to ensure coordination, maybe you should create a new issue in JIRA, so that people send patches to that issue instead of committing directly.

@Prescott, 2.9.4g is not behind 2.9.4 in bug fixes and features. So it is (I think) ready for another release. (I have used it in all my projects for a long time.)

PS: Hearing about the pain of porting code that greatly differs from Java just made me smile (sorry for that :( ). Be ready for responses that go beyond the criticism between the "With all due respect" and "Just my $0.02" parentheses.
DIGY

-----Original Message-----
From: Christopher Currens [mailto:currens.ch...@gmail.com]
Sent: Monday, November 21, 2011 10:19 PM
To: lucene-net-dev@lucene.apache.org; casper...@caspershouse.com
Subject: Re: [Lucene.Net] Roadmap

Some of the Lucene classes have Dispose methods, well, ones that call Close (and that Close method may or may not call base.Close(), as needed). Virtual Dispose methods can be dangerous only in that they're easy to implement wrong. However, it shouldn't be too bad, at least with a line-by-line port, as we would make the call to the base class whenever Lucene does, and that would (should) give us the same behavior, implemented properly. I'm not aware of differences in the JVM regarding inheritance and base methods being called automatically, particularly Close methods.

Slightly unrelated, another annoyance is the use of Java Iterators vs C# Enumerables. A lot of our code is there simply because there are Iterators, but it could be converted to Enumerables. The whole HasNext, Next vs C#'s MoveNext(), Current is annoying, but it's used all over in the base code, and would have to be changed there as well. Either way, I would like to push for that before 3.0.3 is released. IMO, small changes like this still keep the code similar to the line-by-line port, in that it doesn't add any difficulties in
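The Iterator/Enumerator mismatch mentioned in this message can be sketched as a small adapter. The Java side asks "is there a next element?" and then fetches it (hasNext/next), while the .NET side advances first and then reads the current element (MoveNext/Current). The class below is a hypothetical illustration in Java, named EnumeratorSketch for this example only; it is not code from Lucene or Lucene.Net.

```java
import java.util.Iterator;

/** Hypothetical adapter: exposes MoveNext/Current-style access over a Java Iterator. */
final class EnumeratorSketch<T> {
    private final Iterator<T> source;
    private T current;
    private boolean hasCurrent;

    EnumeratorSketch(Iterator<T> source) {
        this.source = source;
    }

    /** Advances to the next element; returns false once the source is exhausted. */
    boolean moveNext() {
        if (source.hasNext()) {
            current = source.next();
            hasCurrent = true;
        } else {
            current = null;
            hasCurrent = false;
        }
        return hasCurrent;
    }

    /** Returns the element selected by the last successful moveNext() call. */
    T current() {
        if (!hasCurrent) {
            throw new IllegalStateException("moveNext() has not returned true");
        }
        return current;
    }
}
```

A loop over the adapter reads `while (e.moveNext()) { use(e.current()); }`, which is the shape a C# `foreach` hides. Code written around hasNext/next has to be restructured, not just renamed, to reach it.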
Re: [Lucene.Net] Roadmap
Digy,

No worries. I wasn't taking them personally. You've been doing this for a lot longer than I have, but I didn't understand your pain until I had to go through it personally. :P

Have you looked at Contrib in a while? There are a lot of projects in Java's Contrib that are not in Lucene.Net. Is this because some of them can't easily (if at all) be ported over to .NET, or just because they've been neglected? I'm trying to get a handle on what's important to port and what isn't. I figured someone with experience could help me decide where to start with everything that's missing.

Thanks,
Christopher

On Mon, Nov 21, 2011 at 2:13 PM, Digy <digyd...@gmail.com> wrote:

Chris,

Sorry if you took my comments about the pain of porting personally. That wasn't my intention.

+1 for all your changes/divergences. I made, or could have made, them too.

DIGY
RE: [Lucene.Net] Roadmap
My English isn't good enough to understand this answer. I hope it is not related to an employee-employer relationship, as in the past.

DIGY

-----Original Message-----
From: Christopher Currens [mailto:currens.ch...@gmail.com]
Sent: Tuesday, November 22, 2011 1:08 AM
To: lucene-net-dev@lucene.apache.org
Subject: Re: [Lucene.Net] Roadmap

To clarify, it wasn't so much *difficult* as it was *painful*. Above, I was implying that it was more difficult than the rest of the code, which by comparison was easier. It wasn't painless to try and map where code changes were from the Java classes into the .NET version. I prefer that style for its readability and the niceties of working with a .NET style of Lucene; however, as I said before, it slowed down the porting process significantly. I hope it didn't come across that I thought it was bad code, because it's probably the most readable code we have in Contrib at the moment.

I want to make it clear that my intention right now is to get Lucene.Net up to date with Java. When I read the Java code, I understand its intent, and I make sure the ported code represents it. That takes enough time as it is; having to figure out where the code went in Lucene.Net, since it wasn't a 1-1 map, was a MINOR annoyance, especially when you compare it to the issues I had dealing with the differences between the two languages, generics especially.

That being said, I don't have a problem with code being converted in a .NET-idiomatic way. In fact, I welcome it, if it still allows the changes to be ported with minimal effort. I feel that at this point in the project, there are some limits to how far I'd like it to diverge. Anyway, my opinion, which may not be in agreement with the group as a whole, is that it would be better to bring the codebase up to date, or at least more up to date with Java's, and then maintain a version with a complete .NET-centric API.
I feel this would be easier, as porting Java's Lucene SVN commits by the week would be a relatively small workload.

On Mon, Nov 21, 2011 at 2:41 PM, Troy Howard <thowar...@gmail.com> wrote:

So, if we're getting back to the line-by-line port discussion... I think either side of this discussion is too extreme. For the case in point Chris just mentioned (I'm not really sure which part was so difficult, as I ported that library in about 30 minutes from scratch)... anything is a pain if it sticks out in the middle of doing something completely different.

The only reason we are able to do this line by line is the general similarity between Java's and C#'s language syntax. If we were porting Lucene to a completely different language, with a totally different syntax, the process would go like this:

- Look at the original code, understand its intent
- Create similar code in the new language that expresses the same intent

When applying changes:

- Look at the original code diffs, understanding the intent of the change
- Look at the ported code, and apply the changed logic's meaning in that language

So, it's just a different thought process. In my opinion, it's a better process because it forces the developer to actually think about the code instead of blindly converting syntax (possibly slightly incorrectly and introducing regressions).

While there is a large volume of unit tests in Lucene, they are unfortunately not really the right tests, which makes porting much more difficult: you can't just rely on the unit tests to verify that your ported code behaves the same. Therefore, it's safer to follow a process that requires the developer to delve deeply into the meaning of the code. Following a line-by-line process is convenient, but it doesn't focus on meaning, which I think is more important.

Thanks,
Troy
RE: [Lucene.Net] Roadmap
Chris,

Now that you have spent some time dealing with the porting, what is your view on creating a fully automated porting tool?

Scott
Re: [Lucene.Net] Roadmap
Next to impossible/really, really hard. There are just some things that don't map quite right. Sharpen is great, but it seems you need code written in a way that makes it easily convertible, and I don't see the folks at Lucene changing their coding style to do that.

An example: 3.0.3 changes classes that inherited from util.Parameter to Java enums. Java enums are more similar to classes than C# enums are. They can have methods, fields, etc. I wound up converting them into enums with extension methods and/or static classes (usually to generate the enum). The way the code was written in Java, there's no way an automated tool could figure that out on its own, unless you had some way to tell it what to do beforehand.

I imagine porting it by hand is probably easier, though it would be nice if there were a tool that would at least convert the syntax from Java to C#, as well as change the naming scheme to a .NET-compatible one. However, that only really helps if you're porting classes from scratch. It could also hide bugs, since it's possible, however unlikely, that something could port perfectly but not behave the same way. A class that has many calls to string.Substring is a good example of this. If the name of the function is changed to the .NET version (.substring to .Substring), it would compile with no problems, but the two are very different. C#'s signature is Substring(int startIndex, int length), while Java's is substring(int beginIndex, int endIndex). The code may appear to work while hiding the issue, or it may throw an exception, depending on the data. A porting tool would probably know about many differences like this, so it's sort of a moot point, in that this relies on the skills of the developer anyway.

I may be wrong, but I just don't see this being a fully automated process ever. I would love to have something automated that at least fixed syntax errors, though this would only work on a line-by-line port.
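The Substring pitfall described above can be shown concretely. The sketch below is in Java; dotNetSubstring is a hypothetical helper written for this illustration that emulates the .NET (startIndex, length) semantics on top of Java's (beginIndex, endIndex) semantics, so both behaviors can be compared on the same input.

```java
/** Hypothetical helper illustrating the two Substring conventions. */
final class SubstringSketch {

    /** Emulates .NET's String.Substring(startIndex, length) using Java's substring(beginIndex, endIndex). */
    static String dotNetSubstring(String s, int startIndex, int length) {
        return s.substring(startIndex, startIndex + length);
    }

    /** Java semantics: the second argument is an exclusive end index, not a length. */
    static String javaSubstring(String s, int beginIndex, int endIndex) {
        return s.substring(beginIndex, endIndex);
    }
}
```

For the string "Lucene.Net" and the arguments (2, 5), the Java call yields "cen" while the .NET-style call yields "cene.", so a renamed call silently returns different data. For the arguments (7, 3), the .NET-style call yields "Net" while the Java call throws StringIndexOutOfBoundsException because the begin index exceeds the end index. Which of the two failure modes shows up depends entirely on the data.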
(Slightly off topic: I think we should always have a line-by-line port, even if our primary goal becomes a fully .NET-style port.)

Either way, any sort of manual or partly automated process would still require a lot of work to make sure things are ported correctly. I also think it would be most manageable as a tool that works on a file-by-file basis (instead of at the project level like Sharpen), for easy review and testing.

Thanks,
Christopher

On Mon, Nov 21, 2011 at 3:30 PM, Scott Lombard <lombardena...@gmail.com> wrote:

Chris,

Now that you have spent some time dealing with the porting, what is your view on creating a fully automated porting tool?

Scott
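The Java enum point made in this reply (enums that carry their own methods) can be illustrated with a short sketch. StoreSketch is a hypothetical enum written for this example, loosely modelled on a "store this field?" option; it is not copied from the Lucene sources.

```java
/** Hypothetical enum: each constant carries its own behavior, as Java allows. */
enum StoreSketch {
    YES {
        @Override
        boolean isStored() {
            return true;
        }
    },
    NO {
        @Override
        boolean isStored() {
            return false;
        }
    };

    /** Implemented per constant; a C# enum cannot declare members like this. */
    abstract boolean isStored();
}
```

A C# enum is a named integral value and cannot hold instance methods or per-constant bodies, which is why a port has to move isStored() somewhere else, for example into an extension method or a static helper class. A tool that only rewrites syntax has no obvious way to choose between those options.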
[Lucene.Net] [jira] [Created] (LUCENENET-457) Lucene locks directory with index after network related problems
Lucene locks directory with index after network related problems

Key: LUCENENET-457
URL: https://issues.apache.org/jira/browse/LUCENENET-457
Project: Lucene.Net
Issue Type: Bug
Components: Lucene.Net Core
Environment: Windows Server 2008
Reporter: Pavel Belousov

I have a directory for my index in a shared folder on another computer in the network. My service writes data to the index. Sometimes the service gets network-related exceptions like "The specified network name is no longer available." After that, the service cannot write anything to the index because of the lock, even if I delete the write.lock file manually.

I did some research and found that the Lucene API has an IndexWriter.Unlock() method, but in my case it does not work. I use the NativeFSLockFactory class. The NativeFSLock class has a private field, LOCK_HELD, with the list of current locks. In my case (after the network-related issues) it still contains the record for the lock (NativeFSLock uses it in the Obtain() method), and I can't delete it through the API.

I suppose that the NativeFSLock.Release() method (which is called from IndexWriter.Unlock()) should delete the record from the LOCK_HELD field. Maybe I'm wrong and there is another approach for handling such problems? At the moment I have implemented a method that deletes the record from LOCK_HELD through reflection.

Thanks a lot.
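The behavior the reporter asks for can be modelled with a small in-process registry. The sketch below is in Java and is a simplified, hypothetical model: LockRegistrySketch, obtain, release and the closeHandle parameter are names invented for this illustration, and only the LOCK_HELD name comes from the report. It is not the NativeFSLock implementation, and it is only an assumption that the stale record in the real code comes from a failure during release.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical model of an in-process "locks currently held" registry. */
final class LockRegistrySketch {
    private static final Set<String> LOCK_HELD = new HashSet<>();

    /** Returns true if the lock was free and is now recorded as held by this process. */
    static synchronized boolean obtain(String lockPath) {
        return LOCK_HELD.add(lockPath);
    }

    /**
     * Releases the lock. closeHandle stands in for closing the underlying file handle,
     * which may throw, for example when a network share has gone away.
     */
    static void release(String lockPath, Runnable closeHandle) {
        try {
            closeHandle.run();
        } finally {
            // Remove the record even if closing failed, so a later obtain() can succeed.
            synchronized (LockRegistrySketch.class) {
                LOCK_HELD.remove(lockPath);
            }
        }
    }

    static synchronized boolean isHeld(String lockPath) {
        return LOCK_HELD.contains(lockPath);
    }
}
```

In this model, putting the removal in a finally block means a failed close still propagates its exception to the caller, but the record no longer blocks the next attempt to obtain the lock. Whether the same ordering is safe for the real lock, where another process may also hold the file, is a separate question that the sketch does not address.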