[jira] [Commented] (LUCENE-5716) Track file handle leaks (FileDescriptor, NIO Path SPI and Socket mostly).
[ https://issues.apache.org/jira/browse/LUCENE-5716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019608#comment-14019608 ]

Dawid Weiss commented on LUCENE-5716:
-------------------------------------

{code}
[junit4] Assumption #1: 'nightly' test group is disabled (@Nightly())
[junit4] 1 Tracked 76 objects, 76 closed, 0 open.
[junit4] 1 FH: FileInputStream(C:\Work\lucene-solr-svn\LUCENE-5716\lucene\build\core\test\J0\.\temp\sort3028791570 611301125partition) [closed]
[junit4] 1 FH: FileInputStream(C:\Work\lucene-solr-svn\LUCENE-5716\lucene\build\core\test\J0\.\temp\sort6926018466 304216348partition) [closed]
[junit4] 1 FH: FileInputStream(C:\Work\lucene-solr-svn\LUCENE-5716\lucene\build\core\test\J0\.\temp\sort6765538901 416981889partition) [closed]
...
{code}

The list of all open/closed file handles is for debug/showcase purposes only.

Track file handle leaks (FileDescriptor, NIO Path SPI and Socket mostly).
-------------------------------------------------------------------------

                Key: LUCENE-5716
                URL: https://issues.apache.org/jira/browse/LUCENE-5716
            Project: Lucene - Core
         Issue Type: Improvement
           Reporter: Dawid Weiss
           Assignee: Dawid Weiss
           Priority: Minor

--
This message was sent by Atlassian JIRA (v6.2#6252)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
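The tracker output above suggests the general shape of the mechanism: register every opened handle together with its opening stack trace, drop it again on close, and whatever is left at the end of a test run is a leak. A minimal sketch of that idea (class and method names are hypothetical, not the actual test-framework code):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: every opened handle is registered with a stack trace
// and removed on close; anything still in the map at suite end is a leak.
public class HandleTracker {
    private static final Map<Object, Exception> OPEN = new ConcurrentHashMap<>();

    public static void onOpen(Object handle) {
        // The Exception is never thrown; it just captures the opening stack trace
        // so the report can say where the leaked handle came from.
        OPEN.put(handle, new Exception("opened here"));
    }

    public static void onClose(Object handle) {
        OPEN.remove(handle);
    }

    public static int openCount() {
        return OPEN.size();
    }
}
```

At the end of a suite, `openCount()` being non-zero would fail the run and the stored exceptions would be printed as the "FH: ... [open]" lines.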
[jira] [Commented] (LUCENE-5716) Track file handle leaks (FileDescriptor, NIO Path SPI and Socket mostly).
[ https://issues.apache.org/jira/browse/LUCENE-5716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019607#comment-14019607 ]

Dawid Weiss commented on LUCENE-5716:
-------------------------------------

You can get a feeling for how it works by switching to LUCENE-5716 and running:

{code}
cd lucene
ant test-core -Dtests.slow=false -Dtestcase=TestOfflineSorter
{code}

Try opening a FileInputStream to something (out of the temporary folders) and leaving it unclosed.
[jira] [Commented] (LUCENE-5738) NativeLock is release if Lock is closed after obtain failed
[ https://issues.apache.org/jira/browse/LUCENE-5738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019632#comment-14019632 ]

Simon Willnauer commented on LUCENE-5738:
-----------------------------------------

bq. Isn't this a valid aspect of the API as documented?

Well, IMO that is not what is happening. Yes, the lock is released such that another process can acquire it, but the same process can't, and that is what makes this trappy and inconsistent IMO. However, I think we have to bring back the static map for this and ramp up the tests; otherwise this is too trappy.

NativeLock is release if Lock is closed after obtain failed
-----------------------------------------------------------

                Key: LUCENE-5738
                URL: https://issues.apache.org/jira/browse/LUCENE-5738
            Project: Lucene - Core
         Issue Type: Bug
   Affects Versions: 4.8.1
           Reporter: Simon Willnauer
           Assignee: Simon Willnauer
            Fix For: 4.9, 5.0

If you obtain the NativeFSLock, try to obtain it again in the same JVM, and close it if it fails, another process will be able to obtain it. This is pretty trappy. If you execute the main class twice the problem becomes pretty obvious.
{noformat}
import org.apache.lucene.store.Lock;
import org.apache.lucene.store.NativeFSLockFactory;

import java.io.File;
import java.io.IOException;

public class TestLock {
  public static void main(String[] foo) throws IOException, InterruptedException {
    NativeFSLockFactory lockFactory = new NativeFSLockFactory(new File("/tmp"));
    Lock lock = lockFactory.makeLock("LOCK");
    if (lock.obtain()) {
      System.out.println("OBTAINED");
    } else {
      lock.close();
      System.out.println("FAILED");
    }
    // try it again and close it if it fails
    lock = lockFactory.makeLock("LOCK"); // this is a new lock
    if (lock.obtain()) {
      System.out.println("OBTAINED AGAIN");
    } else {
      lock.close(); // this releases the lock we obtained
      System.out.println("FAILED on Second");
    }
    Thread.sleep(Integer.MAX_VALUE);
  }
}
{noformat}
[jira] [Commented] (LUCENE-5738) NativeLock is release if Lock is closed after obtain failed
[ https://issues.apache.org/jira/browse/LUCENE-5738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019633#comment-14019633 ]

Uwe Schindler commented on LUCENE-5738:
---------------------------------------

Hi, in that case, with the help of [~rcmuir], we should re-add the process-wide static map of already acquired locks and check against it before acquiring a new lock.

[~rboulton]: Java file handles never escape from the local process. We have only 2 use cases:
- A single process must prevent access to the same index. This is the issue here and can be solved by re-adding the static map of locks.
- Another process must prevent access to the same index. This is currently no issue, because one process cannot release the lock of another process.
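The process-wide guard described here can be sketched as below; the class and member names are hypothetical, not the actual Lucene patch. The idea is that a lock path must first be claimed in a JVM-wide set before the file-system lock is even attempted, so a second acquire in the same process fails without ever touching (and thereby breaking) the native lock:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a per-JVM lock guard. add() on the concurrent set is
// atomic, so only one caller per path can win the in-process claim.
public class HeldLocks {
    private static final Set<String> LOCK_HELD = ConcurrentHashMap.newKeySet();

    // Returns true only if this JVM did not already hold the path.
    public static boolean markHeld(String path) {
        return LOCK_HELD.add(path);
    }

    // Called on release (or on a failed acquire that had claimed the path).
    public static void clearHeld(String path) {
        LOCK_HELD.remove(path);
    }
}
```

Only the caller that wins `markHeld` would go on to attempt the native file-system lock; everyone else fails fast inside the process.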
[jira] [Commented] (LUCENE-5738) NativeLock is release if Lock is closed after obtain failed
[ https://issues.apache.org/jira/browse/LUCENE-5738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019637#comment-14019637 ]

Uwe Schindler commented on LUCENE-5738:
---------------------------------------

bq. well IMO that is not what is happening.

It is what is happening. The close() of the file handle is releasing the lock of the other file handle in the same process. This has nothing to do with the problem of the lock acquire failing; it's just what's documented. So a single process should never try to acquire the lock multiple times through the filesystem. We should only use the native lock between processes; for the single-process case we should use the map.
[jira] [Commented] (LUCENE-5738) NativeLock is release if Lock is closed after obtain failed
[ https://issues.apache.org/jira/browse/LUCENE-5738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019642#comment-14019642 ]

Uwe Schindler commented on LUCENE-5738:
---------------------------------------

bq. Also it's really too bad our tests didn't detect this.

The new test we added in Lucene 4.7 cannot catch this, because it only tests that locking between processes works. But we should add the failing test above as a conventional unit test. It can be done with one process, i.e. a single unit test:
- Acquire the lock
- Try to acquire the lock a second time
- Close the failed lock
- Try to acquire the lock a third time - it should still not work
- Release the master lock
- Try to acquire again - should now work
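Those steps can be walked through in one JVM even without Lucene, using raw NIO file locks. The sketch below is hypothetical (the names and structure are mine, not the proposed unit test) and relies on the JVM throwing OverlappingFileLockException for a second lock attempt on the same file from the same process:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical single-process walk-through of the test steps above,
// using raw NIO FileChannel locks instead of Lucene's NativeFSLock.
public class SingleProcessLockSteps {

    // Wrap tryLock() so an in-JVM overlap reads as "acquire failed" (null).
    static FileLock tryAcquire(FileChannel ch) throws IOException {
        try {
            return ch.tryLock();
        } catch (OverlappingFileLockException e) {
            return null; // this JVM already holds a lock on the file
        }
    }

    // Returns the success/failure of each acquire attempt, in order.
    static List<Boolean> run(Path file) throws IOException {
        List<Boolean> steps = new ArrayList<>();
        FileChannel master = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE);
        FileLock masterLock = tryAcquire(master);
        steps.add(masterLock != null);              // 1: acquire succeeds

        FileChannel second = FileChannel.open(file, StandardOpenOption.WRITE);
        steps.add(tryAcquire(second) != null);      // 2: second acquire fails
        second.close();                             // 3: close the failed lock

        FileChannel third = FileChannel.open(file, StandardOpenOption.WRITE);
        steps.add(tryAcquire(third) != null);       // 4: still fails in this JVM

        masterLock.release();                       // 5: release the master lock
        FileLock again = tryAcquire(third);
        steps.add(again != null);                   // 6: now succeeds
        if (again != null) again.release();
        third.close();
        master.close();
        return steps;
    }
}
```

Note this only checks the in-process view; the trappy part of the bug (step 3 silently releasing the OS-level lock so *another* process could grab it) still needs the cross-process stress test.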
[jira] [Commented] (LUCENE-5738) NativeLock is release if Lock is closed after obtain failed
[ https://issues.apache.org/jira/browse/LUCENE-5738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019659#comment-14019659 ]

Simon Willnauer commented on LUCENE-5738:
-----------------------------------------

bq. It is what is happening. The close() of the file handle is releasing the lock of the other file handle in the same process. [...] We should only use the native lock between processes, for the single-process case we should use the map.

No, you are not reading it right... What I am saying is:
* obtain the lock
* try again from the same process (this fails, closes the channel, and releases the native lock on the FS)
* try again from the same process (this fails again, while another process can succeed)

This is the problem here, since it seems that the JVM prevents itself from obtaining the lock twice. That is what I am arguing about, and we can't detect this with a unit test in a single process.
[jira] [Updated] (LUCENE-5738) NativeLock is release if Lock is closed after obtain failed
[ https://issues.apache.org/jira/browse/LUCENE-5738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer updated LUCENE-5738:
------------------------------------

    Attachment: LUCENE-5738.patch

Here is a patch that adds a more elegant static map and fixes the stress test to fail without the map. It now passes, but we should stress it a bit more and run it for a while.
[jira] [Commented] (LUCENE-5738) NativeLock is release if Lock is closed after obtain failed
[ https://issues.apache.org/jira/browse/LUCENE-5738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019675#comment-14019675 ]

Michael McCandless commented on LUCENE-5738:
--------------------------------------------

+1, patch looks good.

I wouldn't use the word elegant: this is an annoying JDK trap that we have to work around!

You changed the test to also sometimes try to acquire the lock (in the same process) when it already holds the lock, and assert that the 2nd acquire fails. With the bug, this would release the first lock the process had acquired, allowing the 2nd process to illegally obtain the lock, and then the server fails.

Can you rename LOCK_MARKERS to LOCK_HELD (and clearMarkedLock to clearHeldLock) to make it clear that's the purpose of the map? Also the indent of clearMarkedLock is off.
[jira] [Commented] (LUCENE-5716) Track file handle leaks (FileDescriptor, NIO Path SPI and Socket mostly).
[ https://issues.apache.org/jira/browse/LUCENE-5716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019679#comment-14019679 ]

Michael McCandless commented on LUCENE-5716:
--------------------------------------------

This looks really awesome!
[jira] [Updated] (LUCENE-5738) NativeLock is release if Lock is closed after obtain failed
[ https://issues.apache.org/jira/browse/LUCENE-5738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer updated LUCENE-5738:
------------------------------------

    Attachment: LUCENE-5738.patch

Updated patch including a CHANGES.txt entry.
[jira] [Commented] (LUCENE-5738) NativeLock is release if Lock is closed after obtain failed
[ https://issues.apache.org/jira/browse/LUCENE-5738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019688#comment-14019688 ]

Michael McCandless commented on LUCENE-5738:
--------------------------------------------

+1, looks great. Thanks Simon!
[jira] [Commented] (LUCENE-5738) NativeLock is release if Lock is closed after obtain failed
[ https://issues.apache.org/jira/browse/LUCENE-5738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019689#comment-14019689 ]

Uwe Schindler commented on LUCENE-5738:
---------------------------------------

Hi Simon,

looks good. Sorry for the confusing responses. The issue is, as you say, the combination of 2 issues, of which one is a real bug in the JDK.

About the patch: I like it, much simpler than before Robert's cleanup. One small thing: in Java 7 we should use IOUtils only if really required; for the use case here (the finally block) we can use a cool trick. The advantage of doing it like that is that no Exceptions are suppressed, they are recorded. Replace:

{code:java}
} finally {
  if (obtained == false) {
    // not successful - clear up and move out
    clearMarkedLock(path);
    final FileChannel toClose = channel;
    channel = null;
    IOUtils.closeWhileHandlingException(toClose);
  }
}
{code}

by

{code:java}
} finally {
  if (obtained == false) {
    // not successful - clear up and move out
    try (FileChannel toClose = channel) {
      clearMarkedLock(path);
      channel = null;
    }
  }
}
{code}

I will look into the LockStressTest, but for now it looks fine. You can run the stress tester for a very long time using some system properties (like running it the whole night).
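A minimal, self-contained demo (not Lucene code; the names are mine) of the semantics this trick relies on: when the try block throws and close() also throws, try-with-resources keeps the primary exception and records the close() failure as a suppressed exception instead of losing either one:

```java
// Demo of try-with-resources suppressed-exception recording.
public class SuppressedDemo {
    static class Noisy implements AutoCloseable {
        @Override
        public void close() {
            // Failure during cleanup; must not mask the primary failure.
            throw new IllegalStateException("close failed");
        }
    }

    // Returns {primary exception, its first suppressed exception}.
    public static Throwable[] demo() {
        try (Noisy n = new Noisy()) {
            throw new RuntimeException("primary failure");
        } catch (RuntimeException e) {
            return new Throwable[] { e, e.getSuppressed()[0] };
        }
    }
}
```

With a plain finally block calling close(), the IllegalStateException would instead replace the RuntimeException entirely, which is exactly what IOUtils.closeWhileHandlingException works around by swallowing it.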
[jira] [Comment Edited] (LUCENE-5738) NativeLock is release if Lock is closed after obtain failed
[ https://issues.apache.org/jira/browse/LUCENE-5738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019689#comment-14019689 ]

Uwe Schindler edited comment on LUCENE-5738 at 6/6/14 8:55 AM:
---------------------------------------------------------------

Hi Simon,

looks good. Sorry for the confusing responses. The issue is, as you say, the combination of 2 issues, of which one is a real bug in the JDK:
- The lock is released if you call close() on any FileChannel.
- The JDK bug where we cannot reacquire the lock in the same JVM, because the JVM thinks it's still held, but it isn't, so another process can get it.

About the patch: I like it, much simpler than before Robert's cleanup. One small thing: in Java 7 we should use IOUtils only if really required; for the use case here (the finally block) we can use a cool trick. The advantage of doing it like that is that no Exceptions are suppressed, they are recorded. Replace:

{code:java}
} finally {
  if (obtained == false) {
    // not successful - clear up and move out
    clearMarkedLock(path);
    final FileChannel toClose = channel;
    channel = null;
    IOUtils.closeWhileHandlingException(toClose);
  }
}
{code}

by

{code:java}
} finally {
  if (obtained == false) {
    // not successful - clear up and move out
    try (FileChannel toClose = channel) {
      clearMarkedLock(path);
      channel = null;
    }
  }
}
{code}

I will look into the LockStressTest, but for now it looks fine. You can run the stress tester for a very long time using some system properties (like running it the whole night).
[jira] [Commented] (LUCENE-5738) NativeLock is release if Lock is closed after obtain failed
[ https://issues.apache.org/jira/browse/LUCENE-5738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019690#comment-14019690 ]

ASF subversion and git services commented on LUCENE-5738:
---------------------------------------------------------

Commit 1600827 from [~simonw] in branch 'dev/trunk'
[ https://svn.apache.org/r1600827 ]

LUCENE-5738: Ensure NativeFSLock prevents opening the file channel twice if lock is held
[jira] [Commented] (LUCENE-5738) NativeLock is release if Lock is closed after obtain failed
[ https://issues.apache.org/jira/browse/LUCENE-5738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019701#comment-14019701 ]

ASF subversion and git services commented on LUCENE-5738:
---------------------------------------------------------

Commit 1600831 from [~simonw] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1600831 ]

LUCENE-5738: Ensure NativeFSLock prevents opening the file channel twice if lock is held
[jira] [Resolved] (LUCENE-5738) NativeLock is release if Lock is closed after obtain failed
[ https://issues.apache.org/jira/browse/LUCENE-5738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer resolved LUCENE-5738.
Resolution: Fixed

Committed to 4x and trunk.
[jira] [Commented] (LUCENE-5738) NativeLock is release if Lock is closed after obtain failed
[ https://issues.apache.org/jira/browse/LUCENE-5738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019704#comment-14019704 ]

Uwe Schindler commented on LUCENE-5738:

Commit was a little bit too fast for me... but no problem at all; it was just a comment suggesting not to use IOUtils when it is not really needed. The code pattern I posted before is from official Oracle JDK code (now used like that in many places). It is somewhat a misuse of try-with-resources, but very elegant.
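Uwe's exact snippet isn't quoted in this thread; the JDK idiom usually described as an elegant "misuse" of try-with-resources puts an already-open resource into the resource clause solely to guarantee it is closed on every exit path. A hedged, JDK-only sketch (names are mine):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.concurrent.atomic.AtomicBoolean;

public class CloseIdiom {
    // Returns true if the stream was closed by the try-with-resources block.
    static boolean closesOnExit() {
        AtomicBoolean closed = new AtomicBoolean(false);
        InputStream in = new ByteArrayInputStream(new byte[]{1, 2, 3}) {
            @Override public void close() throws IOException {
                closed.set(true);
                super.close();
            }
        };
        // The resource was created elsewhere; the try-with-resources below
        // creates nothing and is used only to guarantee close() on every path.
        try (InputStream managed = in) {
            managed.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return closed.get();
    }

    public static void main(String[] args) {
        System.out.println("closed on exit: " + closesOnExit());
    }
}
```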
[jira] [Commented] (LUCENE-5738) NativeLock is release if Lock is closed after obtain failed
[ https://issues.apache.org/jira/browse/LUCENE-5738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019706#comment-14019706 ]

Simon Willnauer commented on LUCENE-5738:

I saw your comment. I didn't want to change the runtime behaviour with respect to closing the channel in this issue. I think it's fine to convert those places in a different issue.
[jira] [Commented] (LUCENE-5738) NativeLock is release if Lock is closed after obtain failed
[ https://issues.apache.org/jira/browse/LUCENE-5738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019707#comment-14019707 ]

Uwe Schindler commented on LUCENE-5738:

Wanted to note: I had the LockVerify test running with {{ant test-lock-factory -Dlockverify.count=50}} for an hour on Windows; it works.
[jira] [Commented] (SOLR-6147) QUERYHANDLER in Solr's admin GUI should be REQUESTHANDLER
[ https://issues.apache.org/jira/browse/SOLR-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019726#comment-14019726 ]

Stefan Matheis (steffkes) commented on SOLR-6147:

The UI doesn't name things (at least not in this case); we just display what the mbeans handler returns. As far as I understand how things work, this comes from the {{Category}} enum in {{org.apache.solr.core.SolrInfoMBean}}. If we change that, the UI will reflect the change instantly.

QUERYHANDLER in Solr's admin GUI should be REQUESTHANDLER
Key: SOLR-6147
URL: https://issues.apache.org/jira/browse/SOLR-6147
Project: Solr
Issue Type: Improvement
Components: web gui
Reporter: David Smiley
Priority: Minor

In the admin UI, go to Plugins / Stats, where you'll see a QUERYHANDLER section. That should be called REQUESTHANDLER, and likewise the URL to it should match.
[jira] [Created] (LUCENE-5741) IndexWriter.tryDeleteDocument does not work
Zhuravskiy Vitaliy created LUCENE-5741:

Summary: IndexWriter.tryDeleteDocument does not work
Key: LUCENE-5741
URL: https://issues.apache.org/jira/browse/LUCENE-5741
Project: Lucene - Core
Issue Type: Bug
Components: core/index
Affects Versions: 4.3, 4.5, 4.6, 4.7, 4.8, 4.9
Reporter: Zhuravskiy Vitaliy
Priority: Critical

I am using a fresh and open reader, with one segment and 3 documents in the index. tryDeleteDocument always returns false. I dug into the code and saw that segmentInfos.indexOf(info) always returns -1 because org.apache.lucene.index.SegmentInfos does not have an equals method; see this screenshot for more information: http://postimg.org/image/jvtezvqnn/
[jira] [Commented] (LUCENE-5741) IndexWriter.tryDeleteDocument does not work
[ https://issues.apache.org/jira/browse/LUCENE-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019733#comment-14019733 ]

Zhuravskiy Vitaliy commented on LUCENE-5741:

The same problem is discussed here: http://lucene.472066.n3.nabble.com/Unexpected-returning-false-from-IndexWriter-tryDeleteDocument-td4107633.html
[jira] [Assigned] (LUCENE-5741) IndexWriter.tryDeleteDocument does not work
[ https://issues.apache.org/jira/browse/LUCENE-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless reassigned LUCENE-5741:
Assignee: Michael McCandless
[jira] [Commented] (LUCENE-5741) IndexWriter.tryDeleteDocument does not work
[ https://issues.apache.org/jira/browse/LUCENE-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019759#comment-14019759 ]

Michael McCandless commented on LUCENE-5741:

Can you re-test with a newer version of Lucene? This may just be a dup of LUCENE-4986 (fixed in 4.3.1, but it looks like you saw this issue in 4.3.0).
[jira] [Commented] (SOLR-4799) SQLEntityProcessor for zipper join
[ https://issues.apache.org/jira/browse/SOLR-4799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019763#comment-14019763 ]

Alexandre Rafalovitch commented on SOLR-4799:

Morphlines is now part of the Apache Solr distribution. That probably points the direction in which this will go. At the same time, in nearly a year, no further improvements on DIH were made as far as I know. So perhaps this addition should be committed even if it is not ideal.

SQLEntityProcessor for zipper join
Key: SOLR-4799
URL: https://issues.apache.org/jira/browse/SOLR-4799
Project: Solr
Issue Type: New Feature
Components: contrib - DataImportHandler
Reporter: Mikhail Khludnev
Priority: Minor
Labels: dih
Attachments: SOLR-4799.patch

DIH is mostly considered a playground tool, and real usages end up with SolrJ. I want to contribute a few improvements targeting DIH performance. This one provides a performant approach for joining SQL entities with very little memory, in contrast to http://wiki.apache.org/solr/DataImportHandler#CachedSqlEntityProcessor The idea is:
* the parent table is explicitly ordered by its PK in SQL
* the children table is explicitly ordered by the parent_id FK in SQL
* the children entity processor joins the ordered result sets with a 'zipper' algorithm.

Do you think it's worth contributing to DIH? cc: [~goksron] [~jdyer]
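The three bullet points above describe a classic sorted merge join. A hedged sketch of the idea, with plain Java arrays standing in for the two ordered JDBC result sets (names are mine, not the patch's):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ZipperJoin {
    // Merge-joins children (sorted by parentId) onto parents (sorted by id)
    // in a single pass, buffering only the current parent's children.
    static Map<Integer, List<String>> join(int[] parentIds,
                                           int[] childParentIds,
                                           String[] childValues) {
        Map<Integer, List<String>> out = new LinkedHashMap<>();
        int c = 0;
        for (int pid : parentIds) {
            while (c < childParentIds.length && childParentIds[c] < pid) {
                c++; // skip orphan children with no matching parent
            }
            List<String> kids = new ArrayList<>();
            while (c < childParentIds.length && childParentIds[c] == pid) {
                kids.add(childValues[c++]); // zip matching children onto parent
            }
            out.put(pid, kids);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(join(new int[]{1, 2, 3},
                                new int[]{1, 1, 3},
                                new String[]{"a", "b", "c"}));
        // {1=[a, b], 2=[], 3=[c]}
    }
}
```

Because both inputs arrive pre-sorted from SQL, each row is visited exactly once and memory stays bounded by one parent's child list, unlike CachedSqlEntityProcessor, which holds a whole side of the join in memory.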
[jira] [Created] (LUCENE-5742) size of frq record is very huge in the table
Maheshwara Reddy created LUCENE-5742:

Summary: size of frq record is very huge in the table
Key: LUCENE-5742
URL: https://issues.apache.org/jira/browse/LUCENE-5742
Project: Lucene - Core
Issue Type: Bug
Components: core/index
Affects Versions: 4.2.1
Environment: Linux, JBoss, Oracle database
Reporter: Maheshwara Reddy

Hi, we are trying to use Lucene with a JDBC store. In production, the 'frq' and 'tis' records have grown very large. We are unable to shrink these records even though we reindex and optimize with different mergeFactor and maxMergeDocs settings.
[jira] [Updated] (LUCENE-5688) NumericDocValues fields with sparse data can be compressed better
[ https://issues.apache.org/jira/browse/LUCENE-5688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Varun Thacker updated LUCENE-5688:
Attachment: LUCENE-5688.patch

bq. Can you perhaps make the Consumer/Producer package-private? I think only the Format needs to be public?

Done. The Lucene45DocValues producer and consumer are public; we could fix that in some other issue.

In SparseDocValuesProducer I have implemented two ways to get numerics:
- getNumericUsingBinarySearch - uses the binary-search approach
- getNumericUsingHashMap - uses a hash-map-based approach

Tests pass for both, so I think we should benchmark both approaches. Based on the results we could pick one approach, or even keep both and choose the right strategy using data from the benchmark results.

bq. It's based on Solr's HashDocSet which I modify to act as an int-to-long map. I can share the code here if you want.

That would be great. We can replace the HashMap in getNumericUsingHashMap with it.

From what I understand, this is how we can run the luceneutil benchmark:
- python setup.py -prepareTrunk
- svn checkout https://svn.apache.org/repos/asf/lucene/dev/trunk patch
- apply the patch in this checkout
- we need a task file and then call searchBench.py with -index and -search

NumericDocValues fields with sparse data can be compressed better
Key: LUCENE-5688
URL: https://issues.apache.org/jira/browse/LUCENE-5688
Project: Lucene - Core
Issue Type: Improvement
Reporter: Varun Thacker
Priority: Minor
Attachments: LUCENE-5688.patch, LUCENE-5688.patch, LUCENE-5688.patch

I ran into this problem where I had a dynamic field in Solr and indexed data into lots of fields. For each field only a few documents had actual values, and for the remaining documents the default value (0) got indexed. Now when I merge segments, the index size jumps up. For example, I have 10 segments, each with 1 DV field. When I merge the segments into 1, that segment will contain all 10 DV fields with lots of 0s.
This was the motivation behind trying to come up with a compression for a use case like this.
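The two lookup strategies Varun mentions trade time for memory. A hedged sketch of the binary-search variant over a sparse field, where only docs with a non-default value are stored (class and field names here are illustrative, not the patch's):

```java
import java.util.Arrays;

public class SparseNumerics {
    // Sparse numeric doc values: only docs with a non-default value are
    // stored, as parallel arrays sorted by docId.
    private final int[] docIds;      // ascending doc ids that have a value
    private final long[] values;     // values[i] belongs to docIds[i]
    private final long missingValue; // default (e.g. 0) for absent docs

    SparseNumerics(int[] docIds, long[] values, long missingValue) {
        this.docIds = docIds;
        this.values = values;
        this.missingValue = missingValue;
    }

    // O(log n) per lookup; the hash-map variant trades memory for O(1).
    long get(int docId) {
        int idx = Arrays.binarySearch(docIds, docId);
        return idx >= 0 ? values[idx] : missingValue;
    }

    public static void main(String[] args) {
        SparseNumerics dv = new SparseNumerics(
            new int[]{2, 7, 40}, new long[]{11, 22, 33}, 0);
        System.out.println(dv.get(7));  // 22
        System.out.println(dv.get(3));  // 0
    }
}
```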
[jira] [Commented] (LUCENE-5716) Track file handle leaks (FileDescriptor, NIO Path SPI and Socket mostly).
[ https://issues.apache.org/jira/browse/LUCENE-5716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019820#comment-14019820 ]

Shalin Shekhar Mangar commented on LUCENE-5716:

Very cool!

Track file handle leaks (FileDescriptor, NIO Path SPI and Socket mostly).
Key: LUCENE-5716
URL: https://issues.apache.org/jira/browse/LUCENE-5716
Project: Lucene - Core
Issue Type: Improvement
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Minor
[jira] [Updated] (LUCENE-5741) IndexWriter.tryDeleteDocument does not work
[ https://issues.apache.org/jira/browse/LUCENE-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhuravskiy Vitaliy updated LUCENE-5741:

Description: I am using a fresh and open reader, with one segment and 3 documents in the index. tryDeleteDocument always returns false. I dug into the code and saw that segmentInfos.indexOf(info) always returns -1 because org.apache.lucene.index.SegmentInfoPerCommit does not have an equals method; see this screenshot for more information: http://postimg.org/image/jvtezvqnn/

was: I am using a fresh and open reader, with one segment and 3 documents in the index. tryDeleteDocument always returns false. I dug into the code and saw that segmentInfos.indexOf(info) always returns -1 because org.apache.lucene.index.SegmentInfos does not have an equals method; see this screenshot for more information: http://postimg.org/image/jvtezvqnn/
[jira] [Commented] (SOLR-6137) Managed Schema / Schemaless and SolrCloud concurrency issues
[ https://issues.apache.org/jira/browse/SOLR-6137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019872#comment-14019872 ]

Yonik Seeley commented on SOLR-6137:

Gregory, the schemaless case is covered, I think... we either re-run the type-guessing logic on replicas, or pass in a requireSchema=version parameter that would be checked by the replicas. Races between different nodes are handled via optimistic concurrency.

Managed Schema / Schemaless and SolrCloud concurrency issues
Key: SOLR-6137
URL: https://issues.apache.org/jira/browse/SOLR-6137
Project: Solr
Issue Type: Bug
Components: Schema and Analysis, SolrCloud
Reporter: Gregory Chanan

This is a follow-up to a message on the mailing list, linked here: http://mail-archives.apache.org/mod_mbox/lucene-dev/201406.mbox/%3CCAKfebOOcMeVEb010SsdcH8nta%3DyonMK5R7dSFOsbJ_tnre0O7w%40mail.gmail.com%3E

The Managed Schema integration with SolrCloud seems pretty limited. The issue I'm running into is variants of the issue that schema changes are not pushed to all shards/replicas synchronously. So, for example, I can make the following two requests:
1) add a field to the collection on server1 using the Schema API
2) add a document with the new field; the document is routed to a core on server2

Then there appears to be a race between when the document is processed by the core on server2 and when the core on server2, via the ZkIndexSchemaReader, gets the new schema. If the document is processed first, I get a 400 error because the field doesn't exist. This is easily reproducible by adding a sleep to the ZkIndexSchemaReader's processing.

I hit a similar issue with Schemaless: the distributed request handler sends out the document updates, but there is no guarantee that the other shards/replicas see the schema changes made by the update chain.

Another issue I noticed today: making multiple schema API calls concurrently can block; that is, one may get through and the other may loop forever.
So, for reference, the issues include:
1) Schema API changes return success before all cores are updated; subsequent calls attempting to use the new schema may fail
2) Schemaless changes may fail on replicas/other shards for the same reason
3) Concurrent Schema API changes may block

From Steve Rowe on the mailing list:
{quote}
For Schema API users, delaying a couple of seconds after adding fields before using them should work around this problem. While not ideal, I think schema field additions are rare enough in the Solr collection lifecycle that this is not a huge problem.

For schemaless users, the picture is worse, as you noted. Immediate distribution of documents triggering schema field addition could easily prove problematic. Maybe we need a schema-update blocking mode, where after the ZK schema node watch is triggered, all new request processing is halted until the schema is finished downloading/parsing/swapping out? (Such a mode should help Schema API users too.)
{quote}
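Steve's "schema update blocking mode" and Yonik's requireSchema=version check share one shape: gate the update on the replica's local schema version instead of failing immediately. A hedged, standalone sketch of that shape (an AtomicInteger stands in for the replica's schema version; nothing here is Solr API):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SchemaGate {
    // Hypothetical stand-in for a replica's local schema version; in Solr this
    // would be advanced by ZkIndexSchemaReader when a new schema arrives.
    final AtomicInteger localVersion = new AtomicInteger(0);

    // Gate an incoming update on the schema version the coordinator compiled
    // the document against; give up after timeoutMs rather than loop forever.
    boolean awaitVersion(int required, long timeoutMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (localVersion.get() < required) {
            if (System.currentTimeMillis() >= deadline) {
                return false; // schema never caught up: reject the update
            }
            try {
                Thread.sleep(5); // a real impl would wait on a ZK-watch signal
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        SchemaGate gate = new SchemaGate();
        gate.localVersion.set(2);
        System.out.println(gate.awaitVersion(2, 100)); // true
        System.out.println(gate.awaitVersion(3, 50));  // false
    }
}
```

The timeout matters: without it, a replica whose schema watch is delayed would hold update threads indefinitely, which is the blocking behavior item 3 above warns about.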
[jira] [Updated] (LUCENE-5741) IndexWriter.tryDeleteDocument does not work
[ https://issues.apache.org/jira/browse/LUCENE-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhuravskiy Vitaliy updated LUCENE-5741:
Affects Version/s: (was: 4.9) 4.8.1
[jira] [Commented] (LUCENE-5741) IndexWriter.tryDeleteDocument does not work
[ https://issues.apache.org/jira/browse/LUCENE-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019879#comment-14019879 ]

Zhuravskiy Vitaliy commented on LUCENE-5741:

Hi Michael, the bug is still present on 4.3.1. Same situation on 4.8.1 (http://postimg.org/image/zfb2ww6x5/). I wrote: segmentInfos.indexOf(info) always returns -1 because org.apache.lucene.index.SegmentInfos does not have an equals method. Please read http://docs.oracle.com/javase/7/docs/api/java/util/List.html#indexOf(java.lang.Object) ; indexOf uses the equals method of the object. In the screenshot (http://postimg.org/image/jvtezvqnn/) we have two instances of org.apache.lucene.index.SegmentInfoPerCommit (SegmentCommitInfo in 4.8.1) which have the same values but are different to the ArrayList, because the class does not override equals (unlike, for example, SegmentInfo). A class must override equals if you want ArrayList.indexOf to work (ArrayList.indexOf is called inside org.apache.lucene.index.SegmentInfos#indexOf).

Screenshot of the debugger on 4.8.1 with a 4.3.1 index: http://postimg.org/image/zfb2ww6x5/
[jira] [Comment Edited] (LUCENE-5741) IndexWriter.tryDeleteDocument does not work
[ https://issues.apache.org/jira/browse/LUCENE-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019879#comment-14019879 ]

Zhuravskiy Vitaliy edited comment on LUCENE-5741 at 6/6/14 2:35 PM:

Hi Michael, the bug is still present on 4.3.1. Same situation on 4.8.1 (http://postimg.org/image/zfb2ww6x5/). I wrote: segmentInfos.indexOf(info) always returns -1 because org.apache.lucene.index.SegmentInfos does not have an equals method. Please read http://docs.oracle.com/javase/7/docs/api/java/util/List.html#indexOf(java.lang.Object) ; indexOf uses the equals method of the object. In the screenshot (http://postimg.org/image/jvtezvqnn/) we have two instances of org.apache.lucene.index.SegmentInfoPerCommit (SegmentCommitInfo in 4.8.1) which have the same values but are different to the ArrayList, because the class does not override equals (unlike, for example, SegmentInfo). A class must override equals if you want ArrayList.indexOf to work (ArrayList.indexOf is called inside org.apache.lucene.index.SegmentInfos#indexOf).

Screenshot of the debugger on 4.8.1 with a 4.3.1 index: http://postimg.org/image/zfb2ww6x5/

I saw that the solr-lucene-core sources use a hash map instead of a list, and that works. But in the current source code the SegmentCommitInfo class needs an equals method.
[jira] [Commented] (LUCENE-5741) IndexWriter.tryDeleteDocument does not work
[ https://issues.apache.org/jira/browse/LUCENE-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019882#comment-14019882 ]

Michael McCandless commented on LUCENE-5741:

The SegmentInfoPerCommit from the reader should be the very same instance as the one inside IndexWriter's SegmentInfos, and so the Object.equals impl (using ==) works here. So we need to figure out why in your case == returns false.

Can you describe where the reader that you are passing in came from? It must be a near-real-time reader in order to work. Or can you make a small test case?
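Mike's point (Object.equals is ==) is easy to demonstrate with a plain list: an identity-equal element is found, while a value-equal copy is not, which is exactly the -1 Vitaliy sees when his SegmentCommitInfo instance is not the one IndexWriter holds. A hedged sketch (SegInfo is a made-up stand-in, not Lucene's class):

```java
import java.util.List;

public class IndexOfDemo {
    // Hypothetical stand-in for SegmentCommitInfo: a value class that does
    // NOT override equals(), so List.indexOf falls back to identity (==).
    static final class SegInfo {
        final String name;
        SegInfo(String name) { this.name = name; }
    }

    static int indexOfSameInstance() {
        SegInfo original = new SegInfo("_0");
        return List.of(original).indexOf(original); // identity match: found
    }

    static int indexOfEqualValueCopy() {
        List<SegInfo> infos = List.of(new SegInfo("_0"));
        return infos.indexOf(new SegInfo("_0")); // Object.equals is ==: not found
    }

    public static void main(String[] args) {
        System.out.println(indexOfSameInstance());   // 0
        System.out.println(indexOfEqualValueCopy()); // -1
    }
}
```

This is why the question "where did your reader come from?" matters: only a near-real-time reader shares the live SegmentInfoPerCommit instances with the IndexWriter, making the identity comparison succeed.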
[jira] [Commented] (SOLR-4799) SQLEntityProcessor for zipper join
[ https://issues.apache.org/jira/browse/SOLR-4799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14019885#comment-14019885 ] James Dyer commented on SOLR-4799: -- bq. At the same time, in nearly a year, no further improvements on DIH were done as far as I know. So, perhaps this addition should be committed even if it is not ideal. I would say the exact opposite. There are not very many people maintaining DIH code, and those of us that do are lazy about it. Therefore, let's not stuff more big features in and make more code to maintain when there are no maintainers. I have code here in JIRA that I've used in production for years that I've been unwilling to commit just for this very reason. I do see Flume as a great DIH replacement, but from the documentation I don't see it having very great RDBMS support? I think a lot of DIH users are using it to import data from an RDBMS into Solr. SQLEntityProcessor for zipper join -- Key: SOLR-4799 URL: https://issues.apache.org/jira/browse/SOLR-4799 Project: Solr Issue Type: New Feature Components: contrib - DataImportHandler Reporter: Mikhail Khludnev Priority: Minor Labels: dih Attachments: SOLR-4799.patch DIH is mostly considered as a playground tool, and real usages end up with SolrJ. I want to contribute few improvements target DIH performance. This one provides performant approach for joining SQL Entities with miserable memory at contrast to http://wiki.apache.org/solr/DataImportHandler#CachedSqlEntityProcessor The idea is: * parent table is explicitly ordered by it’s PK in SQL * children table is explicitly ordered by parent_id FK in SQL * children entity processor joins ordered resultsets by ‘zipper’ algorithm. Do you think it’s worth to contribute it into DIH? cc: [~goksron] [~jdyer] -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
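The 'zipper' algorithm named in the issue description can be sketched as a standalone merge of two pre-sorted streams (hypothetical code, not the attached patch): parents arrive ordered by PK, children ordered by parent FK, and one forward pass pairs them while holding only the current rows in memory.

```java
import java.util.*;

// Sketch of a 'zipper' join over two sorted inputs. Assumes parentIds is
// sorted ascending and childFks is sorted ascending in lockstep with childVals.
class ZipperJoin {
    static Map<Integer, List<String>> join(int[] parentIds, int[] childFks, String[] childVals) {
        Map<Integer, List<String>> out = new LinkedHashMap<>();
        int c = 0;                                    // single cursor over the child stream
        for (int pid : parentIds) {
            List<String> kids = new ArrayList<>();
            while (c < childFks.length && childFks[c] < pid) c++;          // skip orphan children
            while (c < childFks.length && childFks[c] == pid) kids.add(childVals[c++]); // collect matches
            out.put(pid, kids);
        }
        return out;
    }
}
```

Because neither side is ever buffered in full, memory use is constant regardless of result-set size, which is the contrast with CachedSqlEntityProcessor drawn in the description.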
[jira] [Commented] (SOLR-6018) Solr DataImportHandler not finding dynamic fields
[ https://issues.apache.org/jira/browse/SOLR-6018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14019924#comment-14019924 ] Aaron LaBella commented on SOLR-6018: - Any comments/updates on this issue? Solr DataImportHandler not finding dynamic fields - Key: SOLR-6018 URL: https://issues.apache.org/jira/browse/SOLR-6018 Project: Solr Issue Type: Bug Components: contrib - DataImportHandler Affects Versions: 4.7 Reporter: Aaron LaBella Fix For: 4.9 Attachments: 0001-dih-config-check-for-dynamic-fields.patch There is an issue with *org.apache.solr.handler.dataimport.DocBuilder:addFields* around ~line 643. The logic currently says see if you can find the field from the schema, ie: {code:title=DocBuilder.java|borderStyle=solid} SchemaField sf = schema.getFieldOrNull(key); {code} and, if not found, go ask DIHConfiguration to find it, ie: {code:title=DocBuilder.java|borderStyle=solid} sf = config.getSchemaField(key); {code} The latter call takes into account case-insensitivity, which is a big deal since some databases, ie: DB2, upper-case all the resulting column names. In order to not modify solr-core (ie: the match logic in IndexSchema), I'm attaching a patch that makes DIHConfiguration apply the same case-insensitive logic to the DynamicFields. Without this patch, dynamic fields will not be added to the index unless you declare them like this: {code:xml} <dynamicField name="*_S" type="string" indexed="true" stored="true" /> {code} (note the capital S), which is inconsistent with what I believe to be Solr schema conventions of keeping all schema fields lower-case. Thanks. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
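The case-insensitive matching the patch asks for can be illustrated with a toy matcher (hypothetical code, not the DIH implementation): a wildcard-suffix dynamic-field pattern is compared against the incoming column name ignoring case, so a DB2-style upper-cased column still maps to a lower-case pattern.

```java
import java.util.*;

// Toy case-insensitive dynamic-field matcher. Only handles the common
// "*_suffix" pattern shape; real IndexSchema matching is richer.
class DynamicFieldMatcher {
    static String matchSuffix(List<String> patterns, String columnName) {
        String lower = columnName.toLowerCase(Locale.ROOT);
        for (String p : patterns) {
            // "*_s" matches any name ending in "_s", regardless of case.
            if (p.startsWith("*") && lower.endsWith(p.substring(1).toLowerCase(Locale.ROOT))) {
                return p;
            }
        }
        return null; // no dynamic field applies
    }
}
```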
[jira] [Commented] (SOLR-6143) Bad facet counts from CollapsingQParserPlugin
[ https://issues.apache.org/jira/browse/SOLR-6143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14020051#comment-14020051 ] Joel Bernstein commented on SOLR-6143: -- Let's move this discussion to the users list. I'll provide my answer here and close out the ticket unless a bug comes up during the discussion on the list. I'll repost this to the users list also. The CollapsingQParserPlugin should give you the same facet counts as group.truncate. You're using group.facet, which the CollapsingQParserPlugin doesn't yet support. I think this would be an excellent feature, so we could make a jira ticket to add this feature. Bad facet counts from CollapsingQParserPlugin -- Key: SOLR-6143 URL: https://issues.apache.org/jira/browse/SOLR-6143 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 4.8.1 Environment: UNIX Tomcat 7.0.33 SOLR 4.8.1 16 GB RAM Reporter: David Fennessey I'm noticing a very weird bug using the CollapsingQParserPlugin. We tried to use this plugin when we realized that faceting on the groups would take a ridiculous amount of time. To its credit, it works very quickly, however the facet counts that it gives are incorrect. We have a smallish index of about 200k documents with about 50k distinct groups within it. When we use the group implementation (group=true&group.field=PrSKU&group.facet=true), which I believe this attempts to emulate, the facet counts are totally correct. When we use the field collapsing implementation, it will show an incorrect count for the non-filtered query, but when we go to the filtered query, the facet count corrects itself and matches the document count. 
Here are some SOLR responses: solrslave01:8983/index/select?q=classIDs:12&fl=PrSKU&fq={!collapse%20field=PrSKU}&facet=true&facet.field=at_12_wood_tone The facet field will return <int name="Dark Wood">867</int> <int name="Medium Wood">441</int> <int name="Light Wood">253</int> When I actually apply a filter query like so: solrslave01:8983/index/select?q=classIDs:12&fl=PrSKU&fq={!collapse%20field=PrSKU}&facet=true&facet.field=at_12_wood_tone&fq=at_12_wood_tone:%22Light%20Wood%22 I actually pull back 270 results and the facet updates itself with the correct number at the bottom <int name="Light Wood">270</int> <int name="Dark Wood">68</int> <int name="Medium Wood">66</int> If this were the same number pre and post filter query I would assume that it was simply my data that was bad, however I've pored over this for the better part of a day and I'm pretty sure it's the plugin. For reference, this field that I'm faceting on is a multiValued field, however I have noticed the exact same behavior on non multiValued fields (such as price). I can provide any other details you might need -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-6143) Bad facet counts from CollapsingQParserPlugin
[ https://issues.apache.org/jira/browse/SOLR-6143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein resolved SOLR-6143. -- Resolution: Not a Problem Bad facet counts from CollapsingQParserPlugin -- Key: SOLR-6143 URL: https://issues.apache.org/jira/browse/SOLR-6143 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 4.8.1 Environment: UNIX Tomcat 7.0.33 SOLR 4.8.1 16 GB RAM Reporter: David Fennessey I'm noticing a very weird bug using the CollapsingQParserPlugin. We tried to use this plugin when we realized that faceting on the groups would take a ridiculous amount of time. To its credit, it works very quickly, however the facet counts that it gives are incorrect. We have a smallish index of about 200k documents with about 50k distinct groups within it. When we use the group implementation (group=true&group.field=PrSKU&group.facet=true), which I believe this attempts to emulate, the facet counts are totally correct. When we use the field collapsing implementation, it will show an incorrect count for the non-filtered query, but when we go to the filtered query, the facet count corrects itself and matches the document count. 
Here are some SOLR responses: solrslave01:8983/index/select?q=classIDs:12&fl=PrSKU&fq={!collapse%20field=PrSKU}&facet=true&facet.field=at_12_wood_tone The facet field will return <int name="Dark Wood">867</int> <int name="Medium Wood">441</int> <int name="Light Wood">253</int> When I actually apply a filter query like so: solrslave01:8983/index/select?q=classIDs:12&fl=PrSKU&fq={!collapse%20field=PrSKU}&facet=true&facet.field=at_12_wood_tone&fq=at_12_wood_tone:%22Light%20Wood%22 I actually pull back 270 results and the facet updates itself with the correct number at the bottom <int name="Light Wood">270</int> <int name="Dark Wood">68</int> <int name="Medium Wood">66</int> If this were the same number pre and post filter query I would assume that it was simply my data that was bad, however I've pored over this for the better part of a day and I'm pretty sure it's the plugin. For reference, this field that I'm faceting on is a multiValued field, however I have noticed the exact same behavior on non multiValued fields (such as price). I can provide any other details you might need -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-6147) Rename SolrInfoMBean.Category.QUERYHANDLER to REQUESTHANDLER
[ https://issues.apache.org/jira/browse/SOLR-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated SOLR-6147: --- Fix Version/s: 5.0 Summary: Rename SolrInfoMBean.Category.QUERYHANDLER to REQUESTHANDLER (was: QUERYHANDLER in Solr's admin GUI should be REQUESTHANDLER) Rename SolrInfoMBean.Category.QUERYHANDLER to REQUESTHANDLER Key: SOLR-6147 URL: https://issues.apache.org/jira/browse/SOLR-6147 Project: Solr Issue Type: Improvement Components: web gui Reporter: David Smiley Priority: Minor Fix For: 5.0 In the admin UI, go to Plugins / Stats where you'll see a QUERYHANDLER section. That should be called REQUESTHANDLER, and likewise the URL to it should match. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6147) Rename SolrInfoMBean.Category.QUERYHANDLER to REQUESTHANDLER
[ https://issues.apache.org/jira/browse/SOLR-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14020135#comment-14020135 ] David Smiley commented on SOLR-6147: Thanks for the clarification; I renamed the issue accordingly. I marked this for 5.0 due to the likely desire to retain backwards compatibility on 4x. Rename SolrInfoMBean.Category.QUERYHANDLER to REQUESTHANDLER Key: SOLR-6147 URL: https://issues.apache.org/jira/browse/SOLR-6147 Project: Solr Issue Type: Improvement Components: web gui Reporter: David Smiley Priority: Minor Fix For: 5.0 In the admin UI, go to Plugins / Stats where you'll see a QUERYHANDLER section. That should be called REQUESTHANDLER, and likewise the URL to it should match. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-4799) SQLEntityProcessor for zipper join
[ https://issues.apache.org/jira/browse/SOLR-4799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14020207#comment-14020207 ] Mikhail Khludnev commented on SOLR-4799: Despite the many things described above, I *agree* with [~jdyer]. Until we see real demand for this feature, there is no need to pitch it. This plugin is really easy to check as a separate drop-in, but you see how many users tried to check it ... _no one_. I understand what morphlines is, after all: it's a pretty cool lightweight transformation pipeline, but: - it doesn't have a jdbc input so far (I don't think it's hard to implement) - a pipeline implies a single chain of transformations; I don't see how to naturally join two streams of records. Regarding Flume, I'm still concerned about its minimum footprint. SQLEntityProcessor for zipper join -- Key: SOLR-4799 URL: https://issues.apache.org/jira/browse/SOLR-4799 Project: Solr Issue Type: New Feature Components: contrib - DataImportHandler Reporter: Mikhail Khludnev Priority: Minor Labels: dih Attachments: SOLR-4799.patch DIH is mostly considered as a playground tool, and real usages end up with SolrJ. I want to contribute few improvements target DIH performance. This one provides performant approach for joining SQL Entities with miserable memory at contrast to http://wiki.apache.org/solr/DataImportHandler#CachedSqlEntityProcessor The idea is: * parent table is explicitly ordered by it’s PK in SQL * children table is explicitly ordered by parent_id FK in SQL * children entity processor joins ordered resultsets by ‘zipper’ algorithm. Do you think it’s worth to contribute it into DIH? cc: [~goksron] [~jdyer] -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5379) Query-time multi-word synonym expansion
[ https://issues.apache.org/jira/browse/SOLR-5379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14020221#comment-14020221 ] Jeremy Anderson commented on SOLR-5379: --- I'm in the process of trying to get this logic ported into the 4.8.1 Released Tag. I believe I've gotten the code ported over, but am having problems getting the unit test to run to confirm the correctness of the port. The main reason is the differences between the conf/solrconfig.xml and conf/schema.xml files that exist in the root and, I'm guessing, those used by Tien when the 4.5.0 patch was created. I'm still a SOLR novice, so I'm not quite sure how to properly replicate the schema and configuration settings to get the unit test to run. I'm going to attach patch files shortly for the 4.8.1 code base along with the current stubbed-out configuration files. Any help anyone can provide would be greatly appreciated. My end goal is to hopefully get the multi-term synonym expansion logic to work with a 4.8.1 deployment where we're using an extended version of the SolrQueryParser. (I'm not sure if the multi-term synonym logic is only usable with this patch by the new SynonymQuotedDismaxQParser or by existing DismaxQParsers.) Notes on the 4.8.1 port: * There are now two parsers usable by the FSTSynonymFilterFactory: SolrSynonymParser and WordnetSynonymParser, the latter of which I'm not sure whether any additional logic needs to be implemented for proper usage of the tokenize parameter. * All of the logic implemented in SolrQueryParserBase from 4.5.0 has now been moved into the utility QueryBuilder class. 
Query-time multi-word synonym expansion --- Key: SOLR-5379 URL: https://issues.apache.org/jira/browse/SOLR-5379 Project: Solr Issue Type: Improvement Components: query parsers Reporter: Tien Nguyen Manh Labels: multi-word, queryparser, synonym Fix For: 4.9, 5.0 Attachments: quoted.patch, synonym-expander.patch While dealing with synonym at query time, solr failed to work with multi-word synonyms due to some reasons: - First the lucene queryparser tokenizes user query by space so it split multi-word term into two terms before feeding to synonym filter, so synonym filter can't recognized multi-word term to do expansion - Second, if synonym filter expand into multiple terms which contains multi-word synonym, The SolrQueryParseBase currently use MultiPhraseQuery to handle synonyms. But MultiPhraseQuery don't work with term have different number of words. For the first one, we can extend quoted all multi-word synonym in user query so that lucene queryparser don't split it. There are a jira task related to this one https://issues.apache.org/jira/browse/LUCENE-2605. For the second, we can replace MultiPhraseQuery by an appropriate BoleanQuery SHOULD which contains multiple PhraseQuery in case tokens stream have multi-word synonym. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
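The first workaround in the description (quoting multi-word synonyms so the whitespace-splitting query parser keeps them whole) can be sketched like this; the class name and synonym set are illustrative, not part of the patch:

```java
import java.util.*;

// Pre-process a raw query: wrap any known multi-word synonym in quotes so a
// whitespace-tokenizing parser treats it as one phrase token.
class SynonymQuoter {
    static String quoteMultiWord(String query, Set<String> multiWordSynonyms) {
        String result = query;
        for (String syn : multiWordSynonyms) {
            result = result.replace(syn, "\"" + syn + "\"");
        }
        return result;
    }
}
```

Plain String.replace is only for illustration; a real implementation would need tokenization-aware matching to avoid quoting substrings of larger terms or re-quoting already-quoted phrases.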
[jira] [Comment Edited] (SOLR-5379) Query-time multi-word synonym expansion
[ https://issues.apache.org/jira/browse/SOLR-5379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14020221#comment-14020221 ] Jeremy Anderson edited comment on SOLR-5379 at 6/6/14 6:55 PM: --- I'm in the process of trying to get this logic ported into the 4.8.1 Released Tag. I believe I've gotten the code ported over, but am having problems getting the unit test to run to confirm the correctness of the port. The main reason is the differences in the conf/solrconfig.xml and conf/schema.xml files that exist in the root and I'm guessing those used by Tien when the 4.5.0 patch was created. I'm still a SOLR novice so I'm not quite sure how to properly replicate the schema and configuration settings to get the unit test to run. I'm going to attach patch files shortly for the 4.8.1 code base along with the current stubbed out configuration files. Any help anyone can provide would be greatly appreciated. My end goal is to hopefully be able to get the multi-term synonym expansion logic to work with a 4.8.1 deployment where we're using an extended version of the SolrQueryParser. (I'm not sure if the multi-term synonym logic is only usable with this patch by the new SynonymQuotedDismaxQParser or existing DismaxQarsers). Notes on 4.8.1 port: * There is now 2 parsers usable by the FSTSynonymFilterFactory: SolrSynonymParser WordnetSynonymParser. The latter of which I'm not sure if any additional logic needs to be implemented for proper usage of the tokenize parameter. * All of the logic implemented in SolrQueryParserBase from 4.5.0 has now been moved into the utility QueryBuilder class. was (Author: rpialum): I'm in the process of trying to get this logic ported into the 4.8.1 Released Tag. I believe I've gotten the code ported over, but am having problems getting the unit test to run to confirm the correctness of the port. 
The main reason is the differences in the conf/solrconfig.xml and conf/schema.xml files that exist in the root and I'm guessing those used by Tien when the 4.5.0 patch was created. I'm still a SOLR novice so I'm not quite sure how to properly replicate the schema and configuration settings to get the unit test to run. I'm going to attach patch files shortly for the 4.8.1 code base along with the current stubbed out configuration files. Any help anyone can provide would be greatly appreciated. My end goal is to hopefully be able to get the multi-term synonym expansion logic to work with a 4.8.1 deployment where we're using an extended version of the SolrQueryParser. (I'm not sure if the multi-term synonym logic is only usable with this patch by the new SynonymQuotedDismaxQParser or existing DismaxQarsers). Notes on 4.8.1 port: * There is now 2 parsers usable by the FSTSynonymFilterFactory: SolrSynonymParser WordnetSynonymParser. The later of which I'm not sure if any additional logic needs to be implemented for proper usage of the tokenize parameter. * All of the logic implemented in SolrQueryParserBase from 4.5.0 has now been moved into the utility QueryBuilder class. 
Query-time multi-word synonym expansion --- Key: SOLR-5379 URL: https://issues.apache.org/jira/browse/SOLR-5379 Project: Solr Issue Type: Improvement Components: query parsers Reporter: Tien Nguyen Manh Labels: multi-word, queryparser, synonym Fix For: 4.9, 5.0 Attachments: conf-test-files-4_8_1.patch, quoted-4_8_1.patch, quoted.patch, synonym-expander-4_8_1.patch, synonym-expander.patch While dealing with synonym at query time, solr failed to work with multi-word synonyms due to some reasons: - First the lucene queryparser tokenizes user query by space so it split multi-word term into two terms before feeding to synonym filter, so synonym filter can't recognized multi-word term to do expansion - Second, if synonym filter expand into multiple terms which contains multi-word synonym, The SolrQueryParseBase currently use MultiPhraseQuery to handle synonyms. But MultiPhraseQuery don't work with term have different number of words. For the first one, we can extend quoted all multi-word synonym in user query so that lucene queryparser don't split it. There are a jira task related to this one https://issues.apache.org/jira/browse/LUCENE-2605. For the second, we can replace MultiPhraseQuery by an appropriate BoleanQuery SHOULD which contains multiple PhraseQuery in case tokens stream have multi-word synonym. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5379) Query-time multi-word synonym expansion
[ https://issues.apache.org/jira/browse/SOLR-5379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Anderson updated SOLR-5379: -- Attachment: synonym-expander-4_8_1.patch quoted-4_8_1.patch conf-test-files-4_8_1.patch Initial files for 4.8.1 port. Unit test does not run, therefore the validity of the port is unknown. Query-time multi-word synonym expansion --- Key: SOLR-5379 URL: https://issues.apache.org/jira/browse/SOLR-5379 Project: Solr Issue Type: Improvement Components: query parsers Reporter: Tien Nguyen Manh Labels: multi-word, queryparser, synonym Fix For: 4.9, 5.0 Attachments: conf-test-files-4_8_1.patch, quoted-4_8_1.patch, quoted.patch, synonym-expander-4_8_1.patch, synonym-expander.patch While dealing with synonym at query time, solr failed to work with multi-word synonyms due to some reasons: - First the lucene queryparser tokenizes user query by space so it split multi-word term into two terms before feeding to synonym filter, so synonym filter can't recognized multi-word term to do expansion - Second, if synonym filter expand into multiple terms which contains multi-word synonym, The SolrQueryParseBase currently use MultiPhraseQuery to handle synonyms. But MultiPhraseQuery don't work with term have different number of words. For the first one, we can extend quoted all multi-word synonym in user query so that lucene queryparser don't split it. There are a jira task related to this one https://issues.apache.org/jira/browse/LUCENE-2605. For the second, we can replace MultiPhraseQuery by an appropriate BoleanQuery SHOULD which contains multiple PhraseQuery in case tokens stream have multi-word synonym. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-4799) SQLEntityProcessor for zipper join
[ https://issues.apache.org/jira/browse/SOLR-4799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14020260#comment-14020260 ] Shawn Heisey commented on SOLR-4799: bq. I think a lot of DIH users are using it to import data from an RDBMS into Solr. This is exactly what I use it for. Based on mailing list and IRC traffic, I think that most people who use DIH are using it for database import. DIH works, and it's a lot more efficient than any single-threaded program that I could write. I don't believe that it is a playground tool. Although DIH used to handle *all* our indexing, we currently only use it for full index rebuilds. A SolrJ app handles the once-a-minute maintenance. I have plans to build an internal multi-threaded SolrJ tool to handle full rebuilds, but that effort still has not made it through the design phase. Because DIH works so well, we don't have a strong need to replace it. SQLEntityProcessor for zipper join -- Key: SOLR-4799 URL: https://issues.apache.org/jira/browse/SOLR-4799 Project: Solr Issue Type: New Feature Components: contrib - DataImportHandler Reporter: Mikhail Khludnev Priority: Minor Labels: dih Attachments: SOLR-4799.patch DIH is mostly considered as a playground tool, and real usages end up with SolrJ. I want to contribute few improvements target DIH performance. This one provides performant approach for joining SQL Entities with miserable memory at contrast to http://wiki.apache.org/solr/DataImportHandler#CachedSqlEntityProcessor The idea is: * parent table is explicitly ordered by it’s PK in SQL * children table is explicitly ordered by parent_id FK in SQL * children entity processor joins ordered resultsets by ‘zipper’ algorithm. Do you think it’s worth to contribute it into DIH? cc: [~goksron] [~jdyer] -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-4799) SQLEntityProcessor for zipper join
[ https://issues.apache.org/jira/browse/SOLR-4799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14020207#comment-14020207 ] Mikhail Khludnev edited comment on SOLR-4799 at 6/6/14 7:26 PM: Despite the many things described above, I *agree* with [~jdyer]. Until we see real demand for this feature, there is no need to pitch it. This plugin is really easy to check as a separate drop-in, but you see how many users tried to check it ... _no one_. I understand what morphlines is, after all: it's a pretty cool lightweight transformation pipeline, but: - it doesn't have a jdbc input so far (I don't think it's hard to implement) - a pipeline implies a single chain of transformations; I don't see how to naturally join two streams of records. Regarding Flume, I'm still concerned about its minimum footprint. FWIW, here is Kettle's approach to the subject: http://wiki.pentaho.com/display/EAI/Merge+Join was (Author: mkhludnev): Despite the many things described above, I *agree* with [~jdyer]. Until we see real demand for this feature, there is no need to pitch it. This plugin is really easy to check as a separate drop-in, but you see how many users tried to check it ... _no one_. I understand what morphlines is, after all: it's a pretty cool lightweight transformation pipeline, but: - it doesn't have a jdbc input so far (I don't think it's hard to implement) - a pipeline implies a single chain of transformations; I don't see how to naturally join two streams of records. Regarding Flume, I'm still concerned about its minimum footprint. SQLEntityProcessor for zipper join -- Key: SOLR-4799 URL: https://issues.apache.org/jira/browse/SOLR-4799 Project: Solr Issue Type: New Feature Components: contrib - DataImportHandler Reporter: Mikhail Khludnev Priority: Minor Labels: dih Attachments: SOLR-4799.patch DIH is mostly considered as a playground tool, and real usages end up with SolrJ. 
I want to contribute few improvements target DIH performance. This one provides performant approach for joining SQL Entities with miserable memory at contrast to http://wiki.apache.org/solr/DataImportHandler#CachedSqlEntityProcessor The idea is: * parent table is explicitly ordered by it’s PK in SQL * children table is explicitly ordered by parent_id FK in SQL * children entity processor joins ordered resultsets by ‘zipper’ algorithm. Do you think it’s worth to contribute it into DIH? cc: [~goksron] [~jdyer] -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5742) size of frq record is very huge in the table
[ https://issues.apache.org/jira/browse/LUCENE-5742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14020286#comment-14020286 ] Michael McCandless commented on LUCENE-5742: Try setting TieredMergePolicy.maxMergedSegmentMB? This will limit the maximum segment size... size of frq record is very huge in the table Key: LUCENE-5742 URL: https://issues.apache.org/jira/browse/LUCENE-5742 Project: Lucene - Core Issue Type: Bug Components: core/index Affects Versions: 4.2.1 Environment: Linux, Jboss, Oracle database Reporter: Maheshwara Reddy Labels: performance Hi, We are trying to use Lucene with jdbcstore. In production, the 'frq' and 'tis' record size has grown too huge. We are unable to resize these records to a smaller size even though we try to call reindex and optimize using different mergeFactor and maxMergeDocs -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
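The suggested setting goes on the merge policy installed in the IndexWriterConfig. This is a configuration sketch against the Lucene 4.x API; the 512 MB cap, version constant, and analyzer choice are arbitrary examples, not a recommendation from the comment:

```java
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.TieredMergePolicy;
import org.apache.lucene.util.Version;

// Cap the largest merged segment; merges that would exceed the cap are not selected.
IndexWriterConfig configWithCappedSegments(Analyzer analyzer) {
    TieredMergePolicy tmp = new TieredMergePolicy();
    tmp.setMaxMergedSegmentMB(512.0);                 // example cap: ~512 MB per segment
    IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_48, analyzer);
    iwc.setMergePolicy(tmp);
    return iwc;
}
```

Note this bounds future merged segments; it does not shrink segments that are already oversized.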
[jira] [Assigned] (LUCENE-5699) Lucene classification score calculation normalize and return lists
[ https://issues.apache.org/jira/browse/LUCENE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tommaso Teofili reassigned LUCENE-5699: --- Assignee: Tommaso Teofili Lucene classification score calculation normalize and return lists -- Key: LUCENE-5699 URL: https://issues.apache.org/jira/browse/LUCENE-5699 Project: Lucene - Core Issue Type: Sub-task Components: modules/classification Reporter: Gergő Törcsvári Assignee: Tommaso Teofili Now the classifiers can return only the best matching class. If somebody wants to use them for more complex tasks, they need to modify these classes to get the second and third results too. If it is possible to return a list and it doesn't cost many resources, why don't we do that? (We iterate over a list anyway.) The Bayes classifier returned values that were too small, and there was a bug with zero floats; it was fixed with logarithms. It would be nice to scale the class scores to sum to one; then we could compare the returned score and relevance of two documents. (If we don't do this, the word count in the test documents affects the result score.) With bulletpoints: * In the Bayes classification, normalize score values and return result lists. * In the KNN classifier, add the possibility to return a result list. * Make the ClassificationResult Comparable for list sorting. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
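The normalization asked for here (class scores summing to one, comparable across documents) can be sketched as follows; this is an illustrative helper, not the module's actual code. Working in log space and subtracting the maximum before exponentiating (the log-sum-exp trick) avoids the zero-float underflow mentioned in the description:

```java
// Convert per-class log scores into probabilities that sum to 1.
class ScoreNormalizer {
    static double[] normalize(double[] logScores) {
        double max = Double.NEGATIVE_INFINITY;
        for (double s : logScores) max = Math.max(max, s);
        double sum = 0;
        double[] out = new double[logScores.length];
        for (int i = 0; i < logScores.length; i++) {
            out[i] = Math.exp(logScores[i] - max);   // shift by max so exp() never underflows to 0
            sum += out[i];
        }
        for (int i = 0; i < out.length; i++) out[i] /= sum; // scale so the scores sum to one
        return out;
    }
}
```

With this scaling, a long and a short test document yield scores on the same [0, 1] scale, so their classification confidence can be compared directly.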
[jira] [Commented] (SOLR-5285) Solr response format should support child Docs
[ https://issues.apache.org/jira/browse/SOLR-5285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14020371#comment-14020371 ] Hoss Man commented on SOLR-5285: bq. Correct patch with all the changes. This looks solid to me ... running tests now Solr response format should support child Docs -- Key: SOLR-5285 URL: https://issues.apache.org/jira/browse/SOLR-5285 Project: Solr Issue Type: New Feature Reporter: Varun Thacker Fix For: 4.9, 5.0 Attachments: SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, javabin_backcompat_child_docs.bin Solr has added support for taking childDocs as input ( only XML till now ). It's currently used for BlockJoinQuery. I feel that if a user indexes a document with child docs, even if he isn't using the BJQ features and is just searching which results in a hit on the parentDoc, its childDocs should be returned in the response format. [~hossman_luc...@fucit.org] on IRC suggested that the DocTransformers would be the place to add childDocs to the response. Now given a docId one needs to find out all the childDoc id's. A couple of approaches which I could think of are 1. Maintain the relation between a parentDoc and its childDocs during indexing time in maybe a separate index? 2. Somehow emulate what happens in ToParentBlockJoinQuery.nextDoc() - Given a parentDoc it finds out all the childDocs but this requires a childScorer. Am I missing something obvious on how to find the relation between a parentDoc and its childDocs because none of the above solutions for this look right. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-6145) Concurrent Schema API field additions can result in endless loop
[ https://issues.apache.org/jira/browse/SOLR-6145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gregory Chanan updated SOLR-6145: - Attachment: SOLR-6145.patch Here's a patch that fixes the test. The basic idea is that retrying on a BadVersionException won't help the ManagedIndexSchema, since the latest version will always be past our version. Instead, we just throw an exception that can be caught by the callers and retried. I only put the retry logic in FieldResource for now, but there are a few more places where it should go if we like this approach. I originally thought of doing the retry logic inside ManagedIndexSchema, so we wouldn't need to put it in as many places, but that seemed less clean, since the older schema would be getting the newer schema (presumably from the core), which seems wrong since a schema should exist independently of a core. Concurrent Schema API field additions can result in endless loop Key: SOLR-6145 URL: https://issues.apache.org/jira/browse/SOLR-6145 Project: Solr Issue Type: Bug Components: Schema and Analysis Reporter: Steve Rowe Assignee: Steve Rowe Priority: Critical Attachments: SOLR-6145.patch, concurrent_updates_and_schema_api.patch The optimistic concurrency loop in {{ManagedIndexSchema.addFields()}} is the likely culprit.
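A minimal sketch of the caller-driven retry described above (all class and method names here are hypothetical stand-ins, not the actual ManagedIndexSchema/FieldResource API): the update path refuses to retry internally and signals a version conflict, and the caller re-reads the latest version and reapplies its change against that fresh copy.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class Main {
    static class VersionConflictException extends RuntimeException {}

    // Simulated "latest schema version" held elsewhere (e.g. in ZooKeeper).
    static final AtomicInteger latestVersion = new AtomicInteger(3);

    // Schema-side method: refuses to retry internally and instead
    // signals the conflict to its caller.
    static int tryUpdate(int basedOnVersion) {
        if (basedOnVersion != latestVersion.get()) {
            throw new VersionConflictException(); // our copy is stale
        }
        return latestVersion.incrementAndGet();  // apply the change
    }

    // Caller-side loop: on conflict, re-read the newest version
    // (i.e. re-fetch the schema) and retry against it.
    static int updateWithRetry(int startVersion, int maxAttempts) {
        int version = startVersion;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return tryUpdate(version);
            } catch (VersionConflictException e) {
                version = latestVersion.get(); // refresh and retry
            }
        }
        throw new RuntimeException("gave up after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        // Start from a stale version (1, while the latest is 3): the first
        // attempt conflicts, the retry succeeds and bumps the version to 4.
        System.out.println(updateWithRetry(1, 3));
    }
}
```

The point of keeping the loop in the caller is the one made in the comment: the schema object never has to reach back into the core to fetch a newer copy of itself.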
[JENKINS-MAVEN] Lucene-Solr-Maven-4.x #640: POMs out of sync
Build: https://builds.apache.org/job/Lucene-Solr-Maven-4.x/640/ No tests ran. Build Log: [...truncated 9684 lines...] BUILD FAILED /home/jenkins/jenkins-slave/workspace/Lucene-Solr-Maven-4.x/build.xml:483: The following error occurred while executing this line: /home/jenkins/jenkins-slave/workspace/Lucene-Solr-Maven-4.x/build.xml:104: The following error occurred while executing this line: /home/jenkins/jenkins-slave/workspace/Lucene-Solr-Maven-4.x/lucene/build.xml:194: The following error occurred while executing this line: /home/jenkins/jenkins-slave/workspace/Lucene-Solr-Maven-4.x/lucene/common-build.xml:401: The following error occurred while executing this line: /home/jenkins/jenkins-slave/workspace/Lucene-Solr-Maven-4.x/lucene/common-build.xml:438: Ivy is not available Total time: 6 seconds Build step 'Invoke Ant' marked build as failure Recording test results Email was triggered for: Failure Sending email for trigger: Failure
Re: [JENKINS-MAVEN] Lucene-Solr-Maven-4.x #640: POMs out of sync
All sorts of wonkiness here... #1) the job got run on ubuntu6 instead of on our lucene zone (which is why ivy wasn't found) #2) the lucene zone machine appears to be down #3) the reason the job got run on ubuntu6 seems to be because abayer edited the job config to say it could run on either one... https://builds.apache.org/job/Lucene-Solr-Maven-4.x/jobConfigHistory/showDiffFiles?timestamp1=2014-05-09_16-37-34&timestamp2=2014-06-06_21-44-58 (Note: that diff URL will only work if you have jenkins web auth to login) No idea who abayer is (is that someone on infra?) or why he's editing the job config. : Date: Fri, 6 Jun 2014 21:47:33 +0000 (UTC) : From: Apache Jenkins Server jenk...@builds.apache.org : Reply-To: dev@lucene.apache.org : To: dev@lucene.apache.org : Subject: [JENKINS-MAVEN] Lucene-Solr-Maven-4.x #640: POMs out of sync -Hoss http://www.lucidworks.com/
[jira] [Created] (SOLR-6148) Failure on Jenkins: leaked parallelCoreAdminExecutor threads
Anshum Gupta created SOLR-6148: -- Summary: Failure on Jenkins: leaked parallelCoreAdminExecutor threads Key: SOLR-6148 URL: https://issues.apache.org/jira/browse/SOLR-6148 Project: Solr Issue Type: Bug Reporter: Anshum Gupta Assignee: Anshum Gupta Investigate and provide a fix for MultiThreadedOCPTest failures due to leaked parallelCoreAdminExecutor threads. http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/10490/
[jira] [Commented] (SOLR-5285) Solr response format should support child Docs
[ https://issues.apache.org/jira/browse/SOLR-5285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14020487#comment-14020487 ] ASF subversion and git services commented on SOLR-5285: --- Commit 1601028 from hoss...@apache.org in branch 'dev/trunk' [ https://svn.apache.org/r1601028 ] SOLR-5285: Added a new [child ...] DocTransformer for optionally including Block-Join descendant documents inline in the results of a search
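For context, the committed [child ...] DocTransformer is requested through the fl parameter. The query below is an illustrative sketch, not from the issue itself: the collection and field names (collection1, doc_type, title) are hypothetical, while parentFilter, childFilter, and limit are the transformer's parameters (parentFilter identifies which docs are parents; childFilter and limit optionally restrict which descendants are returned).

```shell
# Return matching parent docs with up to 10 of their child docs inlined.
curl 'http://localhost:8983/solr/collection1/select' \
  --data-urlencode 'q=title:solr' \
  --data-urlencode 'fl=id,title,[child parentFilter=doc_type:parent childFilter=doc_type:chapter limit=10]' \
  --data-urlencode 'wt=json'
```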
Re: [JENKINS-MAVEN] Lucene-Solr-Maven-4.x #640: POMs out of sync
abayer = Andrew Bayer I chatted with JanIV = Jan Iversen, and abayer on #asfinfra today about the issue - here’s a transcript: 5:58:06 PM sarowe Hi, I'm trying to ssh into lucene.zones.apache.org but it times out - is this known/expected? 5:59:01 PM sarowe (I do see a bunch of red zones alerts on status.apache.org - maybe ssh://hudson-fbsd.zones?) 6:17:28 PM sarowe Hello? anybody there who can tell me about the lucene jenkins build slave outage? I see at least one lucene job's configuration was altered today to have it run on ubuntu in addition to lucene. 6:18:10 PM sarowe there's no way it will run there without modifications, btw 6:22:24 PM janIV sarowe: the bunch you see right now is because our staff is fighting hard with upgrades to overcome a security issue. 6:22:44 PM janIV sarowe: I recommend you check it again tomorrow. 6:23:49 PM janIV sarowe: abayer is working hard on getting the jenkins master to work effeciently, maybe thats the cause of the change. 6:24:33 PM abayer sarowe, janIV - that was me, testing to see if it'd run alright elsewhere. 6:24:39 PM sarowe JanIV: makes sense, it was abayer who changed the lucene job configs 6:24:41 PM abayer My bad. 6:25:08 PM abayer Is there a doc somewhere on what's needed for the lucene build env? If so, I'll try to make sure that future general slaves can handle it... 6:25:13 PM sarowe abayer: lucene's jobs are tied to local config on the lucene slave 6:25:37 PM sarowe abayer: the guy who maintains that (Uwe Schindler) doesn't seem to be online ATM 6:25:44 PM janIV abayer: not your bad ! you need to test to make things better, thats quite ok. but maybe tell the affected people. 6:26:36 PM abayer sarowe: Ok, I'll email him later - I'd chatted a bit with Mark Miller about it internally here at Cloudera and he didn't think there was anything specific to the slave, but no biggie - I just got sad seeing so many queued up jobs.  
6:26:57 PM sarowe abayer: just to be sure: lucene heavily uses the dedicated lucene slave - it's going to remain available, isn't it? 6:27:26 PM abayer sarowe: I hope so! I'm not planning to get rid of it, and, well, I can't get rid of it.  6:27:46 PM abayer Nor would I want to - I just want to complement it/supplement it/ get a better box for it/etc. 6:28:20 PM sarowe that'd be great - lucene could use more machine resources for its jobs, for sure 6:29:44 PM janIV abayer: I hope all your changes are documented in some form, not the tests, but the outcome. 6:29:49 PM abayer In any case, my apologies for the misconfig, and yeah, not sure who you need to talk to about lucene.zones - it's been down since yesterday, I think, but I don't know who's involved in getting it up. 6:29:55 PM abayer janIV: Gradually, yes.  6:30:20 PM abayer I've got an evernote note chock full of crap that I'll pull together into something coherent once I feel like we're really at a steady state - next week most likely. 6:30:23 PM sarowe abayer, sure, no problem, thanks for the info 6:31:12 PM janIV abayer: make a jenkins file in trunk/docs then we can all share it. you have commit karma there. 6:31:17 PM abayer yarp yarp
[jira] [Updated] (SOLR-6148) Failure on Jenkins: leaked parallelCoreAdminExecutor threads
[ https://issues.apache.org/jira/browse/SOLR-6148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anshum Gupta updated SOLR-6148: --- Attachment: SOLR-6148.patch The parallelExecutor in CoreAdminHandler is no longer lazily initialized, as the lazy init wasn't doing any good anyway.
[jira] [Commented] (SOLR-5285) Solr response format should support child Docs
[ https://issues.apache.org/jira/browse/SOLR-5285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14020521#comment-14020521 ] Hoss Man commented on SOLR-5285: I've backported to 4x and am running the tests now -- but while backporting I noticed something I overlooked when reviewing the patch: the ChildDocTransformerFactory usage of FieldType.toExternal and FieldType.getFieldQuery from my May 21st patch somehow got lost along the way, so this is still brittle in how it deals with the primaryKey field. Once I've finished backporting the current trunk state to 4x, I'll try to fix that before resolving this issue.
[jira] [Commented] (SOLR-5285) Solr response format should support child Docs
[ https://issues.apache.org/jira/browse/SOLR-5285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14020542#comment-14020542 ] ASF subversion and git services commented on SOLR-5285: --- Commit 1601037 from hoss...@apache.org in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1601037 ] SOLR-5285: Added a new [child ...] DocTransformer for optionally including Block-Join descendant documents inline in the results of a search (merge r1601028)
[jira] [Commented] (SOLR-6148) Failure on Jenkins: leaked parallelCoreAdminExecutor threads
[ https://issues.apache.org/jira/browse/SOLR-6148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14020544#comment-14020544 ] ASF subversion and git services commented on SOLR-6148: --- Commit 1601038 from [~anshumg] in branch 'dev/trunk' [ https://svn.apache.org/r1601038 ] SOLR-6148: Trying to fix Jenkins failures by not LazyLoading the ParallelExecutor in CoreAdminHandler
Re: [JENKINS-MAVEN] Lucene-Solr-Maven-4.x #640: POMs out of sync
More info from gmcdonald on #asfinfra: 6:48:33 PM sarowe JanIV: do you know who's responsible for the lucene (and other) zones? 7:40:20 PM gmcdonald sarowe1: its a jail on a downed host, known issue, will look at it in the morning if noone else gets there first 7:40:45 PM sarowe1 thanks gmcdonald 7:40:54 PM gmcdonald np
[jira] [Commented] (SOLR-6148) Failure on Jenkins: leaked parallelCoreAdminExecutor threads
[ https://issues.apache.org/jira/browse/SOLR-6148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14020581#comment-14020581 ] ASF subversion and git services commented on SOLR-6148: --- Commit 1601043 from [~anshumg] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1601043 ] SOLR-6148: Trying to fix Jenkins failures by not LazyLoading the ParallelExecutor in CoreAdminHandler (Merge from trunk r1601038)
[jira] [Commented] (SOLR-6148) Failure on Jenkins: leaked parallelCoreAdminExecutor threads
[ https://issues.apache.org/jira/browse/SOLR-6148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14020582#comment-14020582 ] Anshum Gupta commented on SOLR-6148: Leaving this open for a day. If there are no more similar failures, will mark this as resolved.
Re: svn commit: r1601038 - /lucene/dev/trunk/solr/core/src/java/org/apache/solr/handler/admin/CoreAdm inHandler.java
: @@ -119,6 +119,8 @@ public class CoreAdminHandler extends Re ... : @@ -129,6 +131,8 @@ public class CoreAdminHandler extends Re Doing the same init in 2 places seems prone to future errors -- why not just inline it with the declaration of the parallelExecutor (and make it final while we're at it) -Hoss http://www.lucidworks.com/
Re: svn commit: r1601038 - /lucene/dev/trunk/solr/core/src/java/org/apache/solr/handler/admin/CoreAdm inHandler.java
The second ctor is only used by a mock, which I didn't look into much, so I just added it in both places. I'll just make it final and inline it. On Fri, Jun 6, 2014 at 5:28 PM, Chris Hostetter hossman_luc...@fucit.org wrote: : Doing the same init in 2 places seems prone to future errors -- why not just inline it with the declaration of the parallelExecutor (and make it final while we're at it) : -Hoss -- Anshum Gupta http://www.anshumgupta.net
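A sketch of the suggestion above (a hypothetical class, not the actual CoreAdminHandler source): initialize the executor at its declaration and mark it final, so every constructor, including the mock-only one, shares a single init site instead of repeating the assignment.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class Main {
    // One init site instead of "parallelExecutor = ..." in each ctor;
    // final guarantees it can never be forgotten or reassigned.
    private final ExecutorService parallelExecutor =
        Executors.newFixedThreadPool(4, r -> {
            Thread t = new Thread(r, "parallelCoreAdminExecutor");
            t.setDaemon(true); // daemon threads can't keep the JVM alive
            return t;
        });

    Main() {}            // normal ctor: field initializer already ran
    Main(Object mock) {} // mock-only ctor: same single init site

    public static void main(String[] args) {
        Main a = new Main();
        Main b = new Main(null);
        System.out.println(a.parallelExecutor != null && b.parallelExecutor != null);
        a.parallelExecutor.shutdown();
        b.parallelExecutor.shutdown();
    }
}
```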
[jira] [Commented] (SOLR-5285) Solr response format should support child Docs
[ https://issues.apache.org/jira/browse/SOLR-5285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14020596#comment-14020596 ] ASF subversion and git services commented on SOLR-5285: --- Commit 1601044 from hoss...@apache.org in branch 'dev/trunk' [ https://svn.apache.org/r1601044 ] SOLR-5285: use FieldType methods to be less brittle
[jira] [Commented] (SOLR-6148) Failure on Jenkins: leaked parallelCoreAdminExecutor threads
[ https://issues.apache.org/jira/browse/SOLR-6148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14020601#comment-14020601 ] ASF subversion and git services commented on SOLR-6148: --- Commit 1601047 from [~anshumg] in branch 'dev/trunk' [ https://svn.apache.org/r1601047 ] SOLR-6148: Trying to fix Jenkins failures by not LazyLoading the ParallelExecutor in CoreAdminHandler
[jira] [Commented] (SOLR-6148) Failure on Jenkins: leaked parallelCoreAdminExecutor threads
[ https://issues.apache.org/jira/browse/SOLR-6148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14020602#comment-14020602 ] ASF subversion and git services commented on SOLR-6148: --- Commit 1601048 from [~anshumg] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1601048 ] SOLR-6148: Trying to fix Jenkins failures by not LazyLoading the ParallelExecutor in CoreAdminHandler (Merge from trunk r1601047)
[jira] [Commented] (SOLR-5285) Solr response format should support child Docs
[ https://issues.apache.org/jira/browse/SOLR-5285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14020638#comment-14020638 ] ASF subversion and git services commented on SOLR-5285: --- Commit 1601052 from hoss...@apache.org in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1601052 ] SOLR-5285: use FieldType methods to be less brittle (merge r1601044)
[jira] [Resolved] (SOLR-5285) Solr response format should support child Docs
[ https://issues.apache.org/jira/browse/SOLR-5285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-5285.
Resolution: Fixed
Assignee: Hoss Man
Fix For: 4.9, 5.0
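The "given a parentDoc, find all its childDocs" question in approach 2 is exactly what Lucene's block indexing answers: the children of a parent are indexed contiguously, immediately before the parent, so a parent's children are the doc ids strictly between the previous parent and this parent. A plain-Java sketch of that invariant (a deliberate simplification; ToParentBlockJoinQuery works against a parent-filter bitset in the same spirit, without materializing lists):

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Block-join layout: each segment block is [child, child, ..., parent].
// Given the set of parent doc ids, a parent's children are the doc ids
// between the previous parent (exclusive) and this parent (exclusive).
class BlockJoinSketch {
    static List<Integer> childrenOf(int parentDocId, BitSet parentDocs) {
        // previousSetBit(-1) returns -1, handling the very first block.
        int prevParent = parentDocs.previousSetBit(parentDocId - 1);
        List<Integer> children = new ArrayList<>();
        for (int doc = prevParent + 1; doc < parentDocId; doc++) {
            children.add(doc);
        }
        return children;
    }
}
```

For example, with parents at docs 2 and 5, the children of parent 5 are docs 3 and 4, with no childScorer needed - only the parent bitset, which is close to what a DocTransformer-based solution ended up needing.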
[jira] [Updated] (SOLR-6145) Concurrent Schema API field additions can result in endless loop
[ https://issues.apache.org/jira/browse/SOLR-6145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gregory Chanan updated SOLR-6145:
Attachment: SOLR-6145v2.patch, SOLR-6145-tests.patch

Two patches: the tests patch just takes the existing test as a template and tests copyFields and adding fields via POST. I verified that these tests fail consistently without the patch. The v2 patch handles the copyFields and adding-fields-via-POST cases. The POST case has a bit of extra logic to make the optimistic concurrency control failure look like what would happen if the concurrent schema changes were actually serialized.

Concurrent Schema API field additions can result in endless loop
Key: SOLR-6145
URL: https://issues.apache.org/jira/browse/SOLR-6145
Project: Solr
Issue Type: Bug
Components: Schema and Analysis
Reporter: Steve Rowe
Assignee: Steve Rowe
Priority: Critical
Attachments: SOLR-6145-tests.patch, SOLR-6145.patch, SOLR-6145v2.patch, concurrent_updates_and_schema_api.patch

The optimistic concurrency loop in {{ManagedIndexSchema.addFields()}} is the likely culprit.
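The optimistic-concurrency pattern at issue can be sketched in plain Java: read the current schema version, build the updated schema, then persist only if the version is still what was read; on conflict, re-read and retry. The endless-loop failure mode arises when a retry keeps working from a stale snapshot instead of the latest one. This is an illustrative stand-in using an AtomicInteger, not Solr's actual code, which versions the managed schema in ZooKeeper:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Stand-in for the retry loop in a ManagedIndexSchema.addFields()-style
// update: version bump models "schema plus the new field".
class OptimisticUpdater {
    private final AtomicInteger schemaVersion = new AtomicInteger(0);

    int addFieldWithRetry() {
        while (true) {
            // Crucial detail: each iteration must re-read the *latest*
            // version. Retrying a compare-and-set against a stale
            // 'observed' value can never succeed once another writer has
            // won, and the loop spins forever.
            int observed = schemaVersion.get();
            int next = observed + 1;
            if (schemaVersion.compareAndSet(observed, next)) {
                return next;
            }
        }
    }
}
```

With a correct re-read, two concurrent writers serialize cleanly: one wins, the other observes the new version on retry and succeeds on its next attempt.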
[jira] [Resolved] (SOLR-6088) Add query re-ranking with the ReRankingQParserPlugin
[ https://issues.apache.org/jira/browse/SOLR-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein resolved SOLR-6088.
Resolution: Fixed

Add query re-ranking with the ReRankingQParserPlugin
Key: SOLR-6088
URL: https://issues.apache.org/jira/browse/SOLR-6088
Project: Solr
Issue Type: New Feature
Components: search
Reporter: Joel Bernstein
Assignee: Joel Bernstein
Fix For: 4.9
Attachments: SOLR-6088.patch, SOLR-6088.patch, SOLR-6088.patch, SOLR-6088.patch, SOLR-6088.patch

This ticket introduces the ReRankingQParserPlugin, which adds query re-ranking/rescoring for Solr. It leverages the new RankQuery framework to plug in the new Lucene QueryRescorer. See ticket LUCENE-5489 for details on the use case. Sample syntax:
{code}
q=*:*&rq={!rerank reRankQuery=$rqq reRankDocs=200 reRankWeight=3}
{code}
In the example above the main query is executed, 200 docs are collected, and they are re-ranked based on the results of the reRankQuery.
[jira] [Assigned] (SOLR-6088) Add query re-ranking with the ReRankingQParserPlugin
[ https://issues.apache.org/jira/browse/SOLR-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein reassigned SOLR-6088:
Assignee: Joel Bernstein
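The re-ranking semantics described in the ticket can be sketched independently of Solr: the top reRankDocs hits from the main query are re-scored by combining the main score with the weighted re-rank score, while hits beyond reRankDocs keep their original order. The combination formula below (mainScore + reRankWeight * reRankScore) is an illustrative simplification of what Lucene's rescoring does, not the plugin's literal code:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

record Hit(int doc, double score) {}

class ReRankSketch {
    // mainResults: hits already sorted by main-query score, descending.
    // reRankScores: doc id -> score from the re-rank query (absent = no match).
    static List<Hit> rerank(List<Hit> mainResults,
                            Map<Integer, Double> reRankScores,
                            int reRankDocs, double reRankWeight) {
        int n = Math.min(reRankDocs, mainResults.size());
        List<Hit> head = new ArrayList<>();
        for (Hit h : mainResults.subList(0, n)) {
            double extra = reRankScores.getOrDefault(h.doc(), 0.0);
            head.add(new Hit(h.doc(), h.score() + reRankWeight * extra));
        }
        // Only the head window is re-sorted; the tail keeps main-query order.
        head.sort(Comparator.comparingDouble(Hit::score).reversed());
        head.addAll(mainResults.subList(n, mainResults.size()));
        return head;
    }
}
```

In the ticket's example, reRankDocs=200 bounds the cost of the (presumably expensive) reRankQuery: it is evaluated only against the 200-doc window, not the full result set.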
[jira] [Resolved] (SOLR-5973) Pluggable Ranking Collectors and Merge Strategies
[ https://issues.apache.org/jira/browse/SOLR-5973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein resolved SOLR-5973.
Resolution: Fixed

Pluggable Ranking Collectors and Merge Strategies
Key: SOLR-5973
URL: https://issues.apache.org/jira/browse/SOLR-5973
Project: Solr
Issue Type: New Feature
Components: search
Reporter: Joel Bernstein
Assignee: Joel Bernstein
Priority: Minor
Fix For: 4.9
Attachments: SOLR-5973.patch, SOLR-5973.patch, SOLR-5973.patch, SOLR-5973.patch, SOLR-5973.patch, SOLR-5973.patch, SOLR-5973.patch, SOLR-5973.patch, SOLR-5973.patch, SOLR-5973.patch, SOLR-5973.patch, SOLR-5973.patch, SOLR-5973.patch, SOLR-5973.patch, SOLR-5973.patch, SOLR-5973.patch

This ticket introduces a new RankQuery and MergeStrategy to Solr. By extending the RankQuery class and implementing its interface, you can specify a custom ranking collector (TopDocsCollector) and distributed merge strategy for a Solr query. Sample syntax:
{code}
q=hello&rq={!customRank param1=a param2=b}&wt=json&indent=true
{code}
In the sample above the new rq (rank query) param:
{code}
rq={!customRank param1=a param2=b}
{code}
points to a QParserPlugin that returns a Query that extends RankQuery. The RankQuery defines the custom ranking and merge strategy for the main query. The RankQuery impl will have to do several things:
1) Implement the getTopDocsCollector() method to return a custom top docs ranking collector.
2) Implement the wrap() method. The QueryComponent calls the wrap() method to wrap the RankQuery around the main query. This design allows the RankQuery to manage Query caching issues and implement custom Query explanations if needed.
3) Implement hashCode() and equals() so the queryResultCache works properly with the main query and custom ranking algorithm.
4) Optionally implement a custom MergeStrategy to handle the merging of distributed results from the shards.
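The merge step in point 4 boils down to a k-way merge of each shard's already-ranked top docs. A plain-Java sketch of that core operation, reduced to bare scores (a simplification of what a custom MergeStrategy must do; real merging also carries doc ids, shard indices, and tie-breaking):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Each shard returns its hits in ranked (descending-score) order; the
// coordinator merges them by repeatedly taking the best current head
// across shards until 'rows' results are produced.
class MergeSketch {
    static List<Double> mergeShardScores(List<List<Double>> shards, int rows) {
        // Queue entries: {score, shardIdx, posInShard}; max-heap on score.
        PriorityQueue<double[]> pq = new PriorityQueue<>(
            (a, b) -> Double.compare(b[0], a[0]));
        for (int s = 0; s < shards.size(); s++) {
            if (!shards.get(s).isEmpty()) {
                pq.add(new double[]{shards.get(s).get(0), s, 0});
            }
        }
        List<Double> merged = new ArrayList<>();
        while (!pq.isEmpty() && merged.size() < rows) {
            double[] top = pq.poll();
            merged.add(top[0]);
            int shard = (int) top[1], pos = (int) top[2] + 1;
            if (pos < shards.get(shard).size()) {
                pq.add(new double[]{shards.get(shard).get(pos), shard, pos});
            }
        }
        return merged;
    }
}
```

A custom ranking algorithm whose scores are not comparable across shards is precisely why the merge strategy is pluggable: the default score-ordered merge shown here would interleave shard results incorrectly for, say, rank-position-based scores.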