[jira] [Commented] (LUCENE-8833) Allow subclasses of MMapDirecory to preload individual IndexInputs

2019-06-07 Thread Simon Willnauer (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16858441#comment-16858441
 ] 

Simon Willnauer commented on LUCENE-8833:
-

I do like the idea of #warm, but its footprint is much bigger since it's a 
public API. For my specific use case I'd subclass MMapDirectory anyway, and 
this change would make that easier. FileSwitchDirectory is quite heavy and 
isn't really built for what I want to do. I would basically need an IndexInput 
factory that I can plug into a directory, one that can alternate between NIOFS 
and mmap etc. and conditionally preload the mmap. Either way, I can work with 
both; I just think this change is the minimum viable change. Let me know if 
you are OK with moving forward.
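
To make the "IndexInput factory" idea concrete, here is a minimal sketch of 
only the per-file decision such a factory would have to make. It is not the 
patch attached to this issue and uses no Lucene classes: OpenModeChooser, Mode 
and the extension sets are hypothetical names invented for illustration, and 
keying the decision on the file extension alone is an assumption.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch; none of these names are Lucene APIs. */
final class OpenModeChooser {
    /** How a file would be opened by the (imagined) pluggable factory. */
    enum Mode { NIO, MMAP, MMAP_PRELOAD }

    private final Set<String> mmapExtensions;
    private final Set<String> preloadExtensions;

    OpenModeChooser(Set<String> mmapExtensions, Set<String> preloadExtensions) {
        // Defensive copies so later changes by the caller do not alter the policy.
        this.mmapExtensions = new HashSet<>(mmapExtensions);
        this.preloadExtensions = new HashSet<>(preloadExtensions);
    }

    /** Returns the text after the last '.', or "" if the name has no dot. */
    static String extension(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1);
    }

    /** Preload wins over plain mmap; everything else falls back to NIO. */
    Mode choose(String fileName) {
        String ext = extension(fileName);
        if (preloadExtensions.contains(ext)) {
            return Mode.MMAP_PRELOAD;
        }
        if (mmapExtensions.contains(ext)) {
            return Mode.MMAP;
        }
        return Mode.NIO;
    }
}
```

With mmap extensions {"tim", "doc"} and preload extensions {"dvd"} (an 
arbitrary choice for the example), choose("_0.dvd") yields MMAP_PRELOAD, 
choose("_0.tim") yields MMAP, and choose("segments_1") yields NIO. A real 
factory could take further signals, such as the IOContext, as extra arguments.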

> Allow subclasses of MMapDirecory to preload individual IndexInputs
> --
>
> Key: LUCENE-8833
> URL: https://issues.apache.org/jira/browse/LUCENE-8833
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Simon Willnauer
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> I think it's useful for subclasses to select the preload flag on a per index 
> input basis rather than all or nothing. Here is a patch that has an 
> overloaded protected openInput method. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8833) Allow subclasses of MMapDirecory to preload individual IndexInputs

2019-06-06 Thread Robert Muir (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16858126#comment-16858126
 ] 

Robert Muir commented on LUCENE-8833:
-

Another idea is to expose the option completely differently to make it easier 
for the search use-case, perhaps as something like {{IndexInput.warm()}}. 
MMapDirectory could call {{load()}} on the relevant bytebuffers, 
NIOFSDirectory could do whatever, and ByteBuffersDirectory could do nothing. 
Someone could use this in their IndexReaderWarmer to efficiently warm their 
index and reduce user latency.
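
As a rough illustration of what the mmap case could boil down to, the sketch 
below uses only the JDK: it maps a file and calls MappedByteBuffer.load(), 
which makes a best effort to bring the mapped content into physical memory. 
MmapWarmSketch, map and warm are hypothetical names; no {{IndexInput.warm()}} 
exists in Lucene as of this discussion, and mapping the whole file as a single 
buffer is a simplifying assumption (it fails for files larger than 2 GB).

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch of a warm() step for a memory-mapped file; not a Lucene API. */
final class MmapWarmSketch {
    /** Maps the whole file read-only as one buffer (assumes the file is under 2 GB). */
    static MappedByteBuffer map(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping remains valid after the channel is closed.
            return channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        }
    }

    /** Best-effort request to make the buffer's content resident in physical memory. */
    static void warm(MappedByteBuffer buffer) {
        buffer.load();
    }
}
```

A caller would do something like warm(map(path)) once per file it wants hot. 
The call is only a hint about residency: the operating system is still free 
to evict the pages again later.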




[jira] [Commented] (LUCENE-8833) Allow subclasses of MMapDirecory to preload individual IndexInputs

2019-06-06 Thread Robert Muir (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16858120#comment-16858120
 ] 

Robert Muir commented on LUCENE-8833:
-

I'm just curious about more details. For the merge use-case, it makes sense to 
hint the operating system to do some read-ahead, since the bits will be 
accessed sequentially. But this flag does a lot more than that: it will touch 
every page too. It's too bad Java has no other way to hit up madvise. :)

And isn't it the case that the IndexInput will already have been opened by 
IndexWriter? I've always been confused about the IOContext ctor for that 
reason. It almost seems like {{clone(IOContext)}} would be more useful; we 
could practically do something there.

Anyway, just some concerns about exposing too much of this flag. Even if you 
can choose it based on arbitrarily complex logic, it would still be an 
all-or-nothing "hammer" because of how Java limits us as far as telling the OS 
our intentions. The current IOContext is not really utilized much, because I 
think the problem is hard.




[jira] [Commented] (LUCENE-8833) Allow subclasses of MMapDirecory to preload individual IndexInputs

2019-06-06 Thread Simon Willnauer (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16857525#comment-16857525
 ] 

Simon Willnauer commented on LUCENE-8833:
-

> what would the iocontext provide to base the preload decision on? just 
> curious.

Sure, the one I had in mind as an example is merge. I am not sure whether it 
makes a big difference; I was just wondering whether there are signals other 
than the file extension. 
I opened LUCENE-8835 to fix the file listing issue FileSwitchDirectory has.




[jira] [Commented] (LUCENE-8833) Allow subclasses of MMapDirecory to preload individual IndexInputs

2019-06-05 Thread Robert Muir (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16856997#comment-16856997
 ] 

Robert Muir commented on LUCENE-8833:
-

what would the iocontext provide to base the preload decision on? just curious.




[jira] [Commented] (LUCENE-8833) Allow subclasses of MMapDirecory to preload individual IndexInputs

2019-06-05 Thread Simon Willnauer (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16856781#comment-16856781
 ] 

Simon Willnauer commented on LUCENE-8833:
-

You are correct, that's what Elasticsearch does. Yet FileSwitchDirectory has 
had many issues in the past and still has some (I am working on one issue 
related to [this|https://github.com/elastic/elasticsearch/pull/37140] and will 
open another issue soon). Especially with the push of pending deletes down to 
FSDirectory, things became trickier for FileSwitchDirectory. That said, I 
think these issues should be fixed and I will work on them; it was more of a 
trigger to look closer. Down the road I also wanted to decide whether to 
preload or not based on the IOContext, which FileSwitchDirectory would not be 
capable of doing in this context. I hope this makes sense?




[jira] [Commented] (LUCENE-8833) Allow subclasses of MMapDirecory to preload individual IndexInputs

2019-06-05 Thread Robert Muir (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16856751#comment-16856751
 ] 

Robert Muir commented on LUCENE-8833:
-

Can't the user do this with FileSwitchDirectory today? I am just mentioning 
that because it's what I had in mind when adding this option; it's not really 
"all or nothing". 

That being said I'm not necessarily opposed to the change, but there is the 
general issue of whether or not we want to support users subclassing stuff like 
MMapDirectory versus using tools like FilterDirectory/FileSwitchDirectory to 
guide behavior like this.
