[jira] [Commented] (LUCENE-8833) Allow subclasses of MMapDirectory to preload individual IndexInputs
[ https://issues.apache.org/jira/browse/LUCENE-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16858441#comment-16858441 ]

Simon Willnauer commented on LUCENE-8833:
-----------------------------------------

I do like the idea of #warm, but the footprint is much bigger since it's a public API. For my specific use case I'd subclass MMapDirectory anyway, and it would be easier that way. FileSwitchDirectory is quite heavy and isn't really built for what I want to do. I basically would need an IndexInput factory that I can plug into a directory, one that can alternate between NIOFS and mmap etc. and conditionally preload the mmap. Either way, I can work with both; I just think this change is the minimum viable change. Let me know if you are OK with moving forward.

> Allow subclasses of MMapDirectory to preload individual IndexInputs
> -------------------------------------------------------------------
>
>                 Key: LUCENE-8833
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8833
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Simon Willnauer
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> I think it's useful for subclasses to select the preload flag on a per index
> input basis rather than all or nothing. Here is a patch that has an
> overloaded protected openInput method.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8833) Allow subclasses of MMapDirectory to preload individual IndexInputs
[ https://issues.apache.org/jira/browse/LUCENE-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16858126#comment-16858126 ]

Robert Muir commented on LUCENE-8833:
-------------------------------------

Another idea is to expose the option completely differently to make it easier for the search use-case, maybe as something like {{IndexInput.warm()}}. MMapDirectory could call {{load()}} on the relevant byte buffers, NIOFSDirectory could do whatever is appropriate, and ByteBuffersDirectory could do nothing. Someone could use this in their IndexReaderWarmer to efficiently warm their index and reduce user latency.
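To illustrate the {{load()}} call mentioned above, here is a minimal sketch that uses only the JDK. {{WarmSketch}} and its methods are hypothetical names for illustration, not Lucene APIs; the only real call relied on is {{MappedByteBuffer.load()}}, which makes a best effort to bring the mapped pages into physical memory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper, not a Lucene API: maps a file and optionally warms it. */
final class WarmSketch {

  /** Maps the whole file read-only; if warm is true, asks the OS to page it in. */
  static MappedByteBuffer map(Path file, boolean warm) {
    try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
      MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
      if (warm) {
        // Best effort only: load() touches the mapped pages so that later
        // reads are less likely to hit a page fault.
        buf.load();
      }
      // The mapping stays valid after the channel is closed.
      return buf;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Writes a small temp file, maps it with warming, and returns its first byte. */
  static byte demo() {
    try {
      Path tmp = Files.createTempFile("warm-sketch", ".bin");
      tmp.toFile().deleteOnExit();
      Files.write(tmp, new byte[] {42, 1, 2, 3});
      return map(tmp, true).get(0);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Because {{load()}} walks the whole mapping, the cost grows with file size, which is presumably why warming would be something a caller opts into per input.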
[jira] [Commented] (LUCENE-8833) Allow subclasses of MMapDirectory to preload individual IndexInputs
[ https://issues.apache.org/jira/browse/LUCENE-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16858120#comment-16858120 ]

Robert Muir commented on LUCENE-8833:
-------------------------------------

I'm just curious about more details. For the merge use-case, it makes sense to hint the operating system to do some read-ahead, since the bits will be accessed sequentially. But this flag does a lot more than that: it will touch every page too. It's too bad Java has no other way to hit up madvise. :)

And isn't it the case that the IndexInput will already have been opened by IndexWriter? So I've always been confused about the IOContext ctor for that reason. It almost seems like {{clone(IOContext)}} would be more useful; we could practically do something there.

Anyway, just some concerns about exposing too much of this flag. Even if you can choose it based on arbitrarily complex logic, it would still be an all-or-nothing "hammer" because of how Java limits us as far as telling the OS our intentions. The current IOContext is not really utilized much, because I think the problem is hard.
[jira] [Commented] (LUCENE-8833) Allow subclasses of MMapDirectory to preload individual IndexInputs
[ https://issues.apache.org/jira/browse/LUCENE-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16857525#comment-16857525 ]

Simon Willnauer commented on LUCENE-8833:
-----------------------------------------

> What would the IOContext provide to base the preload decision on? Just
> curious.

Sure, the one I had in mind as an example is merge. I am not sure if it makes a big difference; I was just wondering whether there are other signals than the file extension. I opened LUCENE-8835 to fix the file listing issue FileSwitchDirectory has.
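A rough sketch of what a preload decision based on both signals could look like, using only the JDK. Everything here is hypothetical: {{PreloadPolicySketch}} and the {{Usage}} enum are stand-ins invented for illustration and are not Lucene's IOContext or any API from the patch. Whether preloading on merge actually helps is the open question in this thread; the sketch only shows how the two signals could be combined.

```java
import java.util.Set;
import java.util.function.BiPredicate;

/** Hypothetical policy helper, not a Lucene API. */
final class PreloadPolicySketch {

  /** Hypothetical stand-in for the kind of signal an IOContext could carry. */
  enum Usage { MERGE, READ, FLUSH }

  /** Returns the text after the last dot in the file name, or "" if there is no dot. */
  static String extension(String fileName) {
    int dot = fileName.lastIndexOf('.');
    return dot < 0 ? "" : fileName.substring(dot + 1);
  }

  /** Preload if the usage is MERGE or the file extension is in the given set. */
  static BiPredicate<String, Usage> byExtensionOrMerge(Set<String> extensions) {
    return (name, usage) ->
        usage == Usage.MERGE || extensions.contains(extension(name));
  }
}
```

With a set containing only "dvd" (an assumed example value), the predicate returns true for "_0.dvd" under any usage, false for "_0.fdt" under {{Usage.READ}}, and true for "_0.fdt" under {{Usage.MERGE}}.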
[jira] [Commented] (LUCENE-8833) Allow subclasses of MMapDirectory to preload individual IndexInputs
[ https://issues.apache.org/jira/browse/LUCENE-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16856997#comment-16856997 ]

Robert Muir commented on LUCENE-8833:
-------------------------------------

What would the IOContext provide to base the preload decision on? Just curious.
[jira] [Commented] (LUCENE-8833) Allow subclasses of MMapDirectory to preload individual IndexInputs
[ https://issues.apache.org/jira/browse/LUCENE-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16856781#comment-16856781 ]

Simon Willnauer commented on LUCENE-8833:
-----------------------------------------

You are correct, that's what Elasticsearch does. Yet FileSwitchDirectory has had many issues in the past and still has some (I am working on one issue related to [this|https://github.com/elastic/elasticsearch/pull/37140] and will open another issue soon). Especially with the push of pending deletes down to FSDirectory, things became more tricky for FileSwitchDirectory. That said, I think these issues should be fixed and I will work on them; it was more of a trigger to look closer. I also wanted to decide whether to preload based on the IOContext down the road, which FileSwitchDirectory would not be capable of doing in this context. I hope this makes sense?
[jira] [Commented] (LUCENE-8833) Allow subclasses of MMapDirectory to preload individual IndexInputs
[ https://issues.apache.org/jira/browse/LUCENE-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16856751#comment-16856751 ]

Robert Muir commented on LUCENE-8833:
-------------------------------------

Can't the user do this with FileSwitchDirectory today? I am just mentioning that because it's what I had in mind when adding this option; it's not really "all or nothing".

That being said, I'm not necessarily opposed to the change, but there is the general issue of whether or not we want to support users subclassing stuff like MMapDirectory versus using tools like FilterDirectory/FileSwitchDirectory to guide behavior like this.