[jira] [Commented] (OAK-8001) Lucene index can be empty (no :data node) in composite node store setup

2019-01-24 Thread Tommaso Teofili (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-8001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751983#comment-16751983
 ] 

Tommaso Teofili commented on OAK-8001:
--

+1 let's get this running and tested for a couple of weeks first.

> Lucene index can be empty (no :data node) in composite node store setup
> ---
>
> Key: OAK-8001
> URL: https://issues.apache.org/jira/browse/OAK-8001
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: lucene
>Reporter: Vikas Saurabh
>Assignee: Vikas Saurabh
>Priority: Major
> Fix For: 1.12, 1.11.0
>
>
> In normal setups, even if no data is written to the index, an empty (valid) 
> lucene index is created - that's useful because it avoids having to check 
> everywhere whether the index exists before opening it.
> {{DefaultIndexWriter#close}} has an explicit comment stating that this is 
> intentional.
> In composite node stores though, if an index doesn't get any data to be 
> indexed, then {{MultiplexingIndexWriter}} never opens a {{DefaultIndexWriter}} 
> (for one or all mounts, depending on whether there were any writes and, if 
> so, which mount they hit).
> {{MultiplexingIndexWriter}} does delegate {{close}} to its opened writers, but 
> that doesn't bring {{DefaultIndexWriter#close}} into play if no writer was 
> opened for a given mount.
> This then leads to a situation in composite node stores where empty indexes 
> can be missing the {{:data}} node. In fact, this was one of the causes of 
> hitting OAK-7983 in an AEM-based project.
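For illustration, here is a minimal Java sketch of the failure mode and one fix idea. Everything below is made up for the sketch (it is not Oak's actual {{MultiplexingIndexWriter}}): per-mount writers are opened lazily, so a mount without writes never gets a writer, and the fix idea is to open one for every mount before closing.

{code:java}
// Hedged sketch, not Oak code: per-mount writers are created lazily, so a
// mount that saw no writes never gets a writer -- and hence no empty ":data"
// node when close() only delegates to the writers that were opened.
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

interface IndexWriterSketch {
    void addDocument(String path, String doc) throws IOException;
    // For the per-mount default writer, close() persists the index --
    // creating a valid, empty ":data" node even when nothing was written.
    void close() throws IOException;
}

class MultiplexingWriterSketch implements IndexWriterSketch {
    private final Map<String, IndexWriterSketch> writers = new HashMap<>();
    private final Iterable<String> allMounts;
    private final Function<String, IndexWriterSketch> factory;

    MultiplexingWriterSketch(Iterable<String> allMounts,
                             Function<String, IndexWriterSketch> factory) {
        this.allMounts = allMounts;
        this.factory = factory;
    }

    @Override
    public void addDocument(String path, String doc) throws IOException {
        // Lazily open the writer for the mount that owns this path.
        writers.computeIfAbsent(mountOf(path), factory).addDocument(path, doc);
    }

    @Override
    public void close() throws IOException {
        // Fix idea: ensure each mount has a writer before closing, so every
        // mount persists at least an empty (valid) index.
        for (String mount : allMounts) {
            writers.computeIfAbsent(mount, factory);
        }
        for (IndexWriterSketch w : writers.values()) {
            w.close();
        }
    }

    private static String mountOf(String path) {
        // Placeholder mount resolution; Oak resolves mounts via a
        // MountInfoProvider instead.
        return path.startsWith("/libs") ? "libs" : "default";
    }
}
{code}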



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-7983) LazyLuceneIndexNode#getIndexNode can cause NPE

2019-01-24 Thread Thomas Mueller (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-7983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751978#comment-16751978
 ] 

Thomas Mueller commented on OAK-7983:
-

I think the changes in OAK-7947 should resolve the issue 
(http://svn.apache.org/r1852007).

> LazyLuceneIndexNode#getIndexNode can cause NPE
> --
>
> Key: OAK-7983
> URL: https://issues.apache.org/jira/browse/OAK-7983
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: lucene
>Affects Versions: 1.9.13
>Reporter: Tommaso Teofili
>Assignee: Thomas Mueller
>Priority: Major
> Attachments: OAK-7983.0.patch
>
>
> Changes for OAK-7947 introduced a LazyLuceneIndexNode. Its methods call 
> IndexTracker#findIndexNode and #acquireIndexNode, which can return `null`; 
> however, no proper null checks are performed.
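For illustration, the missing guard amounts to something like the sketch below. The types and the nullability of {{acquireIndexNode}} are inferred from this description and the stack trace in OAK-7947; this is not the attached OAK-7983.0.patch.

{code:java}
// Illustrative guard only -- not the attached patch. The tracker lookup can
// return null (e.g. when the index is not available), so callers must check
// before dereferencing.
import org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker;
import org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexNode;

class NullSafeIndexLookup {
    static LuceneIndexNode acquireOrFail(IndexTracker tracker, String path) {
        LuceneIndexNode node = tracker.acquireIndexNode(path);
        if (node == null) {
            // Previously the null result was dereferenced, causing the NPE.
            throw new IllegalStateException("Index not available: " + path);
        }
        return node;
    }
}
{code}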



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-8001) Lucene index can be empty (no :data node) in composite node store setup

2019-01-24 Thread Thomas Mueller (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-8001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751979#comment-16751979
 ] 

Thomas Mueller commented on OAK-8001:
-

> backport this to 1.10
> a bit of baking period

Yes, I suggest we backport once we have tested the changes more thoroughly.



> Lucene index can be empty (no :data node) in composite node store setup
> ---
>
> Key: OAK-8001
> URL: https://issues.apache.org/jira/browse/OAK-8001
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: lucene
>Reporter: Vikas Saurabh
>Assignee: Vikas Saurabh
>Priority: Major
> Fix For: 1.12, 1.11.0
>
>
> In normal setups, even if no data is written to the index, an empty (valid) 
> lucene index is created - that's useful because it avoids having to check 
> everywhere whether the index exists before opening it.
> {{DefaultIndexWriter#close}} has an explicit comment stating that this is 
> intentional.
> In composite node stores though, if an index doesn't get any data to be 
> indexed, then {{MultiplexingIndexWriter}} never opens a {{DefaultIndexWriter}} 
> (for one or all mounts, depending on whether there were any writes and, if 
> so, which mount they hit).
> {{MultiplexingIndexWriter}} does delegate {{close}} to its opened writers, but 
> that doesn't bring {{DefaultIndexWriter#close}} into play if no writer was 
> opened for a given mount.
> This then leads to a situation in composite node stores where empty indexes 
> can be missing the {{:data}} node. In fact, this was one of the causes of 
> hitting OAK-7983 in an AEM-based project.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-7989) Build Jackrabbit Oak #1891 failed

2019-01-24 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-7989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751804#comment-16751804
 ] 

Hudson commented on OAK-7989:
-

Previously failing build now is OK.
 Passed run: [Jackrabbit Oak 
#1903|https://builds.apache.org/job/Jackrabbit%20Oak/1903/] [console 
log|https://builds.apache.org/job/Jackrabbit%20Oak/1903/console]

> Build Jackrabbit Oak #1891 failed
> -
>
> Key: OAK-7989
> URL: https://issues.apache.org/jira/browse/OAK-7989
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: continuous integration
>Reporter: Hudson
>Priority: Major
>
> No description is provided
> The build Jackrabbit Oak #1891 has failed.
> First failed run: [Jackrabbit Oak 
> #1891|https://builds.apache.org/job/Jackrabbit%20Oak/1891/] [console 
> log|https://builds.apache.org/job/Jackrabbit%20Oak/1891/console]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (OAK-8001) Lucene index can be empty (no :data node) in composite node store setup

2019-01-24 Thread Vikas Saurabh (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-8001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Saurabh resolved OAK-8001.

   Resolution: Fixed
Fix Version/s: 1.11.0

Fixed on trunk at [r1852084|https://svn.apache.org/r1852084].

[~tmueller], [~teofili], I think we'd want to backport this to 1.10 (along with 
OAK-7947 and OAK-7983), but I'd like a bit of a baking period first. Wdyt?

> Lucene index can be empty (no :data node) in composite node store setup
> ---
>
> Key: OAK-8001
> URL: https://issues.apache.org/jira/browse/OAK-8001
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: lucene
>Reporter: Vikas Saurabh
>Assignee: Vikas Saurabh
>Priority: Major
> Fix For: 1.12, 1.11.0
>
>
> In normal setups, even if no data is written to the index, an empty (valid) 
> lucene index is created - that's useful because it avoids having to check 
> everywhere whether the index exists before opening it.
> {{DefaultIndexWriter#close}} has an explicit comment stating that this is 
> intentional.
> In composite node stores though, if an index doesn't get any data to be 
> indexed, then {{MultiplexingIndexWriter}} never opens a {{DefaultIndexWriter}} 
> (for one or all mounts, depending on whether there were any writes and, if 
> so, which mount they hit).
> {{MultiplexingIndexWriter}} does delegate {{close}} to its opened writers, but 
> that doesn't bring {{DefaultIndexWriter#close}} into play if no writer was 
> opened for a given mount.
> This then leads to a situation in composite node stores where empty indexes 
> can be missing the {{:data}} node. In fact, this was one of the causes of 
> hitting OAK-7983 in an AEM-based project.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-7989) Build Jackrabbit Oak #1891 failed

2019-01-24 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-7989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751460#comment-16751460
 ] 

Hudson commented on OAK-7989:
-

Previously failing build now is OK.
 Passed run: [Jackrabbit Oak 
#1902|https://builds.apache.org/job/Jackrabbit%20Oak/1902/] [console 
log|https://builds.apache.org/job/Jackrabbit%20Oak/1902/console]

> Build Jackrabbit Oak #1891 failed
> -
>
> Key: OAK-7989
> URL: https://issues.apache.org/jira/browse/OAK-7989
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: continuous integration
>Reporter: Hudson
>Priority: Major
>
> No description is provided
> The build Jackrabbit Oak #1891 has failed.
> First failed run: [Jackrabbit Oak 
> #1891|https://builds.apache.org/job/Jackrabbit%20Oak/1891/] [console 
> log|https://builds.apache.org/job/Jackrabbit%20Oak/1891/console]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (OAK-8003) MongoDocumentStore does not log server details

2019-01-24 Thread Marcel Reutegger (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-8003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger updated OAK-8003:
--
Fix Version/s: 1.10.1

Merged into 1.10 branch: http://svn.apache.org/r1852057

> MongoDocumentStore does not log server details
> --
>
> Key: OAK-8003
> URL: https://issues.apache.org/jira/browse/OAK-8003
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: mongomk
>Affects Versions: 1.10.0
>Reporter: Marcel Reutegger
>Assignee: Marcel Reutegger
>Priority: Minor
> Fix For: 1.12, 1.10.1
>
>
> The MongoDocumentStore no longer logs the server details on startup. This is 
> a regression introduced by the changes for OAK-6087.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (OAK-8003) MongoDocumentStore does not log server details

2019-01-24 Thread Marcel Reutegger (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-8003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger resolved OAK-8003.
---
Resolution: Fixed

Fixed in trunk: http://svn.apache.org/r1852052

> MongoDocumentStore does not log server details
> --
>
> Key: OAK-8003
> URL: https://issues.apache.org/jira/browse/OAK-8003
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: mongomk
>Affects Versions: 1.10.0
>Reporter: Marcel Reutegger
>Assignee: Marcel Reutegger
>Priority: Minor
> Fix For: 1.12
>
>
> The MongoDocumentStore no longer logs the server details on startup. This is 
> a regression introduced by the changes for OAK-6087.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (OAK-8003) MongoDocumentStore does not log server details

2019-01-24 Thread Marcel Reutegger (JIRA)
Marcel Reutegger created OAK-8003:
-

 Summary: MongoDocumentStore does not log server details
 Key: OAK-8003
 URL: https://issues.apache.org/jira/browse/OAK-8003
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: mongomk
Affects Versions: 1.10.0
Reporter: Marcel Reutegger
Assignee: Marcel Reutegger
 Fix For: 1.12


The MongoDocumentStore no longer logs the server details on startup. This is a 
regression introduced by the changes for OAK-6087.
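For illustration, the kind of startup logging this issue is about could look like the hedged sketch below. The {{buildInfo}} command is standard MongoDB; the exact message and call site inside {{MongoDocumentStore}} are assumptions.

{code:java}
// Illustrative only: log MongoDB server details once at startup. The log
// wording is made up; only the driver calls are real API.
import com.mongodb.client.MongoDatabase;
import org.bson.Document;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class MongoStartupInfo {
    private static final Logger LOG =
            LoggerFactory.getLogger(MongoStartupInfo.class);

    static void logServerDetails(MongoDatabase db) {
        // "buildInfo" returns server metadata such as the version string.
        Document buildInfo = db.runCommand(new Document("buildInfo", 1));
        LOG.info("Connected to MongoDB database '{}', server version {}",
                db.getName(), buildInfo.getString("version"));
    }
}
{code}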



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (OAK-8002) RDBDocumentStore: MissingLastRevSeeker slow on DB2

2019-01-24 Thread Julian Reschke (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-8002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-8002:

Attachment: OAK-8002.diff

> RDBDocumentStore: MissingLastRevSeeker slow on DB2
> --
>
> Key: OAK-8002
> URL: https://issues.apache.org/jira/browse/OAK-8002
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: rdbmk
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>Priority: Minor
> Fix For: 1.12
>
> Attachments: OAK-8002.diff
>
>
> This is because the generic {{MissingLastRevSeeker}} gets candidates in 
> batches (of 100), but in order to do batching, requires the results to be 
> sorted by ID.
> For DB2 we by default have indices on both ID and MODIFIED, but contrary to 
> our expectation a query that involves both indices does not perform well. 
> Adding a compound index on ID *and* MODIFIED improves performance, but I'm 
> hesitant to add this just to improve a recovery job.
> A more logical approach would be not to require batching/sorting by adopting 
> the approach in {{MongoMissingLastRevSeeker}}, which doesn't require sorting 
> by ID in the first place.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-8002) RDBDocumentStore: MissingLastRevSeeker slow on DB2

2019-01-24 Thread Julian Reschke (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-8002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751137#comment-16751137
 ] 

Julian Reschke commented on OAK-8002:
-

Potential fix: 
https://issues.apache.org/jira/secure/attachment/12956148/OAK-8002.diff

> RDBDocumentStore: MissingLastRevSeeker slow on DB2
> --
>
> Key: OAK-8002
> URL: https://issues.apache.org/jira/browse/OAK-8002
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: rdbmk
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>Priority: Minor
> Fix For: 1.12
>
> Attachments: OAK-8002.diff
>
>
> This is because the generic {{MissingLastRevSeeker}} gets candidates in 
> batches (of 100), but in order to do batching, requires the results to be 
> sorted by ID.
> For DB2 we by default have indices on both ID and MODIFIED, but contrary to 
> our expectation a query that involves both indices does not perform well. 
> Adding a compound index on ID *and* MODIFIED improves performance, but I'm 
> hesitant to add this just to improve a recovery job.
> A more logical approach would be not to require batching/sorting by adopting 
> the approach in {{MongoMissingLastRevSeeker}}, which doesn't require sorting 
> by ID in the first place.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (OAK-8002) RDBDocumentStore: MissingLastRevSeeker slow on DB2

2019-01-24 Thread Julian Reschke (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-8002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-8002:

Fix Version/s: 1.12

> RDBDocumentStore: MissingLastRevSeeker slow on DB2
> --
>
> Key: OAK-8002
> URL: https://issues.apache.org/jira/browse/OAK-8002
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: rdbmk
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>Priority: Minor
> Fix For: 1.12
>
>
> This is because the generic {{MissingLastRevSeeker}} gets candidates in 
> batches (of 100), but in order to do batching, requires the results to be 
> sorted by ID.
> For DB2 we by default have indices on both ID and MODIFIED, but contrary to 
> our expectation a query that involves both indices does not perform well. 
> Adding a compound index on ID *and* MODIFIED improves performance, but I'm 
> hesitant to add this just to improve a recovery job.
> A more logical approach would be not to require batching/sorting by adopting 
> the approach in {{MongoMissingLastRevSeeker}}, which doesn't require sorting 
> by ID in the first place.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-7989) Build Jackrabbit Oak #1891 failed

2019-01-24 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-7989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751025#comment-16751025
 ] 

Hudson commented on OAK-7989:
-

Previously failing build now is OK.
 Passed run: [Jackrabbit Oak 
#1901|https://builds.apache.org/job/Jackrabbit%20Oak/1901/] [console 
log|https://builds.apache.org/job/Jackrabbit%20Oak/1901/console]

> Build Jackrabbit Oak #1891 failed
> -
>
> Key: OAK-7989
> URL: https://issues.apache.org/jira/browse/OAK-7989
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: continuous integration
>Reporter: Hudson
>Priority: Major
>
> No description is provided
> The build Jackrabbit Oak #1891 has failed.
> First failed run: [Jackrabbit Oak 
> #1891|https://builds.apache.org/job/Jackrabbit%20Oak/1891/] [console 
> log|https://builds.apache.org/job/Jackrabbit%20Oak/1891/console]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (OAK-8002) RDBDocumentStore: MissingLastRevSeeker slow on DB2

2019-01-24 Thread Julian Reschke (JIRA)
Julian Reschke created OAK-8002:
---

 Summary: RDBDocumentStore: MissingLastRevSeeker slow on DB2
 Key: OAK-8002
 URL: https://issues.apache.org/jira/browse/OAK-8002
 Project: Jackrabbit Oak
  Issue Type: Technical task
  Components: rdbmk
Reporter: Julian Reschke
Assignee: Julian Reschke


This is because the generic {{MissingLastRevSeeker}} gets candidates in batches 
(of 100), but in order to do batching, requires the results to be sorted by ID.

For DB2 we by default have indices on both ID and MODIFIED, but contrary to our 
expectation a query that involves both indices does not perform well. Adding a 
compound index on ID *and* MODIFIED improves performance, but I'm hesitant to 
add this just to improve a recovery job.

A more logical approach would be not to require batching/sorting by adopting 
the approach in {{MongoMissingLastRevSeeker}}, which doesn't require sorting by 
ID in the first place.
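For illustration, the compound index mentioned above could be created via plain JDBC as sketched below. {{NODES}}, {{ID}} and {{MODIFIED}} follow the naming used in this issue; the index name is made up.

{code:java}
// Illustrative only: a compound index on (ID, MODIFIED) lets the batched,
// ID-sorted recovery query be answered from a single index.
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

class CompoundIndexSketch {
    static void createCompoundIndex(Connection con) throws SQLException {
        try (Statement stmt = con.createStatement()) {
            stmt.execute("CREATE INDEX NODES_ID_MOD ON NODES (ID, MODIFIED)");
        }
    }
}
{code}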



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-6749) Segment-Tar standby sync fails with "in-memory" blobs present in the source repo

2019-01-24 Thread Francesco Mari (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-6749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16750980#comment-16750980
 ] 

Francesco Mari commented on OAK-6749:
-

[~Csaba Varga], thanks for your contribution.

> Segment-Tar standby sync fails with "in-memory" blobs present in the source 
> repo
> 
>
> Key: OAK-6749
> URL: https://issues.apache.org/jira/browse/OAK-6749
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: blob, tarmk-standby
>Affects Versions: 1.6.2
>Reporter: Csaba Varga
>Assignee: Francesco Mari
>Priority: Major
> Fix For: 1.10.1, 1.8.12, 1.10
>
> Attachments: OAK-6749-01.patch, OAK-6749-02.patch, 
> repack_binaries.groovy
>
>
> We have run into an issue when trying to transition from an active/active 
> Mongo NodeStore cluster to a single Segment-Tar server with cold standby. The 
> issue itself manifests when the standby server tries to pull changes from the 
> primary after the first round of online revision GC.
> Let me summarize the way we ended up with the current state, and my 
> hypothesis about what happened, based on my debugging so far:
> # We started with a Mongo NodeStore and an external FileDataStore as the blob 
> store. The FileDataStore was set up with minRecordLength=4096. The Mongo 
> store stores blobs below minRecordLength as special "in-memory" blobIDs where 
> the data itself is baked into the ID string in hex.
> # We have executed a sidegrade of the Mongo store into a Segment-Tar store. 
> Our datastore is over 1TB in size, so copying the binaries wasn't an option. 
> The new repository is simply reusing the existing datastore. The "in-memory" 
> blobIDs still look like external blobIDs to the sidegrade process, so they 
> were copied into the Segment-Tar repository as-is, instead of being converted 
> into the efficient in-line format.
> # The server started up without issues on the new Segment-Tar store. The 
> migrated "in-memory" blob IDs seem to work fine, if a bit sub-optimal.
> # At this point, we have created a cold standby instance by copying the files 
> of the stopped primary instance and making the necessary config changes on 
> both servers.
> # Everything worked fine until the primary server started its first round of 
> online revision GC. After that process completed, the standby node started 
> throwing exceptions about missing segments, and eventually stopped 
> altogether. In the meantime, the following warning showed up in the primary 
> log:
> {code:java}
> 29.09.2017 06:12:08.088 *WARN* [nioEventLoopGroup-3-10] org.apache.jackrabbit.oak.segment.standby.server.ExceptionHandler Exception caught on the server
> io.netty.handler.codec.TooLongFrameException: frame length (8208) exceeds the allowed maximum (8192)
> at io.netty.handler.codec.LineBasedFrameDecoder.fail(LineBasedFrameDecoder.java:146)
> at io.netty.handler.codec.LineBasedFrameDecoder.fail(LineBasedFrameDecoder.java:142)
> at io.netty.handler.codec.LineBasedFrameDecoder.decode(LineBasedFrameDecoder.java:99)
> at io.netty.handler.codec.LineBasedFrameDecoder.decode(LineBasedFrameDecoder.java:75)
> at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:411)
> at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:248)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:345)
> at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:345)
> at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
> at 

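A note on the numbers in the warning quoted above: hex encoding doubles the payload size, so a blob just under {{minRecordLength=4096}} yields an ID of roughly 8K characters, which is exactly why the line-based decoder's 8192-byte limit is exceeded (observed frame length: 8208). A small runnable sketch of the arithmetic; the {{0x}} ID prefix is an assumption, the doubling is the point:

{code:java}
// Why "in-memory" blob IDs can blow the standby protocol's frame limit.
class InMemoryBlobIdSketch {
    static String toInMemoryId(byte[] data) {
        StringBuilder sb = new StringBuilder("0x"); // prefix is assumed
        for (byte b : data) {
            sb.append(String.format("%02x", b));    // 2 hex chars per byte
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] blob = new byte[4095]; // just under minRecordLength=4096
        String id = toInMemoryId(blob);
        // 4095 bytes -> 8190 hex chars, plus prefix: already at the
        // 8192-byte line limit before any further ID overhead.
        System.out.println(id.length()); // prints 8192
    }
}
{code}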
[jira] [Commented] (OAK-6749) Segment-Tar standby sync fails with "in-memory" blobs present in the source repo

2019-01-24 Thread Csaba Varga (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-6749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16750959#comment-16750959
 ] 

Csaba Varga commented on OAK-6749:
--

I've attached repack_binaries.groovy. It's a slightly anonymized version of the 
script we ran on our production repository (just removed all mentions of my 
employer).

> Segment-Tar standby sync fails with "in-memory" blobs present in the source 
> repo
> 
>
> Key: OAK-6749
> URL: https://issues.apache.org/jira/browse/OAK-6749
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: blob, tarmk-standby
>Affects Versions: 1.6.2
>Reporter: Csaba Varga
>Assignee: Francesco Mari
>Priority: Major
> Fix For: 1.10.1, 1.8.12, 1.10
>
> Attachments: OAK-6749-01.patch, OAK-6749-02.patch, 
> repack_binaries.groovy
>
>
> We have run into an issue when trying to transition from an active/active 
> Mongo NodeStore cluster to a single Segment-Tar server with cold standby. The 
> issue itself manifests when the standby server tries to pull changes from the 
> primary after the first round of online revision GC.
> Let me summarize the way we ended up with the current state, and my 
> hypothesis about what happened, based on my debugging so far:
> # We started with a Mongo NodeStore and an external FileDataStore as the blob 
> store. The FileDataStore was set up with minRecordLength=4096. The Mongo 
> store stores blobs below minRecordLength as special "in-memory" blobIDs where 
> the data itself is baked into the ID string in hex.
> # We have executed a sidegrade of the Mongo store into a Segment-Tar store. 
> Our datastore is over 1TB in size, so copying the binaries wasn't an option. 
> The new repository is simply reusing the existing datastore. The "in-memory" 
> blobIDs still look like external blobIDs to the sidegrade process, so they 
> were copied into the Segment-Tar repository as-is, instead of being converted 
> into the efficient in-line format.
> # The server started up without issues on the new Segment-Tar store. The 
> migrated "in-memory" blob IDs seem to work fine, if a bit sub-optimal.
> # At this point, we have created a cold standby instance by copying the files 
> of the stopped primary instance and making the necessary config changes on 
> both servers.
> # Everything worked fine until the primary server started its first round of 
> online revision GC. After that process completed, the standby node started 
> throwing exceptions about missing segments, and eventually stopped 
> altogether. In the meantime, the following warning showed up in the primary 
> log:
> {code:java}
> 29.09.2017 06:12:08.088 *WARN* [nioEventLoopGroup-3-10] org.apache.jackrabbit.oak.segment.standby.server.ExceptionHandler Exception caught on the server
> io.netty.handler.codec.TooLongFrameException: frame length (8208) exceeds the allowed maximum (8192)
> at io.netty.handler.codec.LineBasedFrameDecoder.fail(LineBasedFrameDecoder.java:146)
> at io.netty.handler.codec.LineBasedFrameDecoder.fail(LineBasedFrameDecoder.java:142)
> at io.netty.handler.codec.LineBasedFrameDecoder.decode(LineBasedFrameDecoder.java:99)
> at io.netty.handler.codec.LineBasedFrameDecoder.decode(LineBasedFrameDecoder.java:75)
> at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:411)
> at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:248)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:345)
> at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:345)
> at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at 

[jira] [Updated] (OAK-6749) Segment-Tar standby sync fails with "in-memory" blobs present in the source repo

2019-01-24 Thread Csaba Varga (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-6749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Csaba Varga updated OAK-6749:
-
Attachment: repack_binaries.groovy

> Segment-Tar standby sync fails with "in-memory" blobs present in the source 
> repo
> 
>
> Key: OAK-6749
> URL: https://issues.apache.org/jira/browse/OAK-6749
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: blob, tarmk-standby
>Affects Versions: 1.6.2
>Reporter: Csaba Varga
>Assignee: Francesco Mari
>Priority: Major
> Fix For: 1.10.1, 1.8.12, 1.10
>
> Attachments: OAK-6749-01.patch, OAK-6749-02.patch, 
> repack_binaries.groovy
>
>
> We have run into an issue when trying to transition from an active/active 
> Mongo NodeStore cluster to a single Segment-Tar server with cold standby. The 
> issue itself manifests when the standby server tries to pull changes from the 
> primary after the first round of online revision GC.
> Let me summarize the way we ended up with the current state, and my 
> hypothesis about what happened, based on my debugging so far:
> # We started with a Mongo NodeStore and an external FileDataStore as the blob 
> store. The FileDataStore was set up with minRecordLength=4096. The Mongo 
> store stores blobs below minRecordLength as special "in-memory" blobIDs where 
> the data itself is baked into the ID string in hex.
> # We have executed a sidegrade of the Mongo store into a Segment-Tar store. 
> Our datastore is over 1TB in size, so copying the binaries wasn't an option. 
> The new repository is simply reusing the existing datastore. The "in-memory" 
> blobIDs still look like external blobIDs to the sidegrade process, so they 
> were copied into the Segment-Tar repository as-is, instead of being converted 
> into the efficient in-line format.
> # The server started up without issues on the new Segment-Tar store. The 
> migrated "in-memory" blob IDs seem to work fine, if a bit sub-optimal.
> # At this point, we have created a cold standby instance by copying the files 
> of the stopped primary instance and making the necessary config changes on 
> both servers.
> # Everything worked fine until the primary server started its first round of 
> online revision GC. After that process completed, the standby node started 
> throwing exceptions about missing segments, and eventually stopped 
> altogether. In the meantime, the following warning showed up in the primary 
> log:
> {code:java}
> 29.09.2017 06:12:08.088 *WARN* [nioEventLoopGroup-3-10] org.apache.jackrabbit.oak.segment.standby.server.ExceptionHandler Exception caught on the server
> io.netty.handler.codec.TooLongFrameException: frame length (8208) exceeds the allowed maximum (8192)
> at io.netty.handler.codec.LineBasedFrameDecoder.fail(LineBasedFrameDecoder.java:146)
> at io.netty.handler.codec.LineBasedFrameDecoder.fail(LineBasedFrameDecoder.java:142)
> at io.netty.handler.codec.LineBasedFrameDecoder.decode(LineBasedFrameDecoder.java:99)
> at io.netty.handler.codec.LineBasedFrameDecoder.decode(LineBasedFrameDecoder.java:75)
> at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:411)
> at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:248)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:345)
> at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:345)
> at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
> at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
> at 

[jira] [Commented] (OAK-7947) Lazy loading of Lucene index files startup

2019-01-24 Thread JIRA


[ 
https://issues.apache.org/jira/browse/OAK-7947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16750952#comment-16750952
 ] 

Michael Dürig commented on OAK-7947:


Would it be feasible to add a way for users to trigger the downloading of 
indexes? This could be used, e.g., to start downloading indexes in the 
background, or for pre-warming instances before switching them live.

Arguably this is a topic for a separate issue, so let's follow up in one if 
this is feasible at all. If not, let's forget it.
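For illustration only, such a trigger could be as small as the sketch below; nothing like this exists in Oak today, and every name here is made up:

{code:java}
// Hypothetical pre-warm trigger: force each index to be opened (and thus
// downloaded) in the background, e.g. before switching an instance live.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class IndexPrewarmSketch {
    interface IndexOpener {
        void open(String indexPath); // opening an index triggers its download
    }

    static void prewarm(Iterable<String> indexPaths, IndexOpener opener) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        for (String path : indexPaths) {
            pool.submit(() -> opener.open(path));
        }
        pool.shutdown(); // queued downloads continue in the background
    }
}
{code}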

> Lazy loading of Lucene index files startup
> --
>
> Key: OAK-7947
> URL: https://issues.apache.org/jira/browse/OAK-7947
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: lucene, query
>Reporter: Thomas Mueller
>Assignee: Thomas Mueller
>Priority: Major
> Fix For: 1.12
>
> Attachments: OAK-7947.patch, OAK-7947_v2.patch, OAK-7947_v3.patch, 
> OAK-7947_v4.patch, OAK-7947_v5.patch, lucene-index-open-access.zip
>
>
> Right now, all Lucene index binaries are loaded on startup (I think when the 
> first query is run, to do cost calculation). This is a performance problem if 
> the index files are large and need to be downloaded from the data store.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-7947) Lazy loading of Lucene index files startup

2019-01-24 Thread Tommaso Teofili (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-7947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16750947#comment-16750947
 ] 

Tommaso Teofili commented on OAK-7947:
--

+1, thanks Thomas. I think it sounds like the most reasonable compromise for 
the current situation.

> Lazy loading of Lucene index files startup
> --
>
> Key: OAK-7947
> URL: https://issues.apache.org/jira/browse/OAK-7947
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: lucene, query
>Reporter: Thomas Mueller
>Assignee: Thomas Mueller
>Priority: Major
> Fix For: 1.12
>
> Attachments: OAK-7947.patch, OAK-7947_v2.patch, OAK-7947_v3.patch, 
> OAK-7947_v4.patch, OAK-7947_v5.patch, lucene-index-open-access.zip
>
>
> Right now, all Lucene index binaries are loaded on startup (I think when the 
> first query is run, to do cost calculation). This is a performance problem if 
> the index files are large and need to be downloaded from the data store.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (OAK-7947) Lazy loading of Lucene index files startup

2019-01-24 Thread Thomas Mueller (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-7947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Mueller updated OAK-7947:

Fix Version/s: 1.12

> Lazy loading of Lucene index files startup
> --
>
> Key: OAK-7947
> URL: https://issues.apache.org/jira/browse/OAK-7947
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: lucene, query
>Reporter: Thomas Mueller
>Assignee: Thomas Mueller
>Priority: Major
> Fix For: 1.12
>
> Attachments: OAK-7947.patch, OAK-7947_v2.patch, OAK-7947_v3.patch, 
> OAK-7947_v4.patch, OAK-7947_v5.patch, lucene-index-open-access.zip
>
>
> Right now, all Lucene index binaries are loaded on startup (I think when the 
> first query is run, to do cost calculation). This is a performance problem if 
> the index files are large and need to be downloaded from the data store.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-7947) Lazy loading of Lucene index files startup

2019-01-24 Thread Thomas Mueller (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-7947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16750944#comment-16750944
 ] 

Thomas Mueller commented on OAK-7947:
-

http://svn.apache.org/r1852007 (trunk)
includes the LuceneIndexMBeanImpl patch above (so the index update doesn't 
download the indexes to get stats, unless the system property is set).

> Lazy loading of Lucene index files startup
> --
>
> Key: OAK-7947
> URL: https://issues.apache.org/jira/browse/OAK-7947
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: lucene, query
>Reporter: Thomas Mueller
>Assignee: Thomas Mueller
>Priority: Major
> Attachments: OAK-7947.patch, OAK-7947_v2.patch, OAK-7947_v3.patch, 
> OAK-7947_v4.patch, OAK-7947_v5.patch, lucene-index-open-access.zip
>
>
> Right now, all Lucene index binaries are loaded on startup (I think when the 
> first query is run, to do cost calculation). This is a performance problem if 
> the index files are large and need to be downloaded from the data store.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-6749) Segment-Tar standby sync fails with "in-memory" blobs present in the source repo

2019-01-24 Thread Francesco Mari (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-6749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16750930#comment-16750930
 ] 

Francesco Mari commented on OAK-6749:
-

[~Csaba Varga], I think there is some value in sharing your script on this 
issue. If other users get stuck with this problem in 1.6, this issue will 
probably pop up. If that happens, you would surely be doing someone else a 
great service!

> Segment-Tar standby sync fails with "in-memory" blobs present in the source 
> repo
> 
>
> Key: OAK-6749
> URL: https://issues.apache.org/jira/browse/OAK-6749
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: blob, tarmk-standby
>Affects Versions: 1.6.2
>Reporter: Csaba Varga
>Assignee: Francesco Mari
>Priority: Major
> Fix For: 1.10.1, 1.8.12, 1.10
>
> Attachments: OAK-6749-01.patch, OAK-6749-02.patch
>
>
> We have run into an issue when trying to transition from an active/active 
> Mongo NodeStore cluster to a single Segment-Tar server with cold standby. The 
> issue itself manifests when the standby server tries to pull changes from the 
> primary after the first round of online revision GC.
> Let me summarize the way we ended up with the current state, and my 
> hypothesis about what happened, based on my debugging so far:
> # We started with a Mongo NodeStore and an external FileDataStore as the blob 
> store. The FileDataStore was set up with minRecordLength=4096. The Mongo 
> store stores blobs below minRecordLength as special "in-memory" blobIDs where 
> the data itself is baked into the ID string in hex.
> # We have executed a sidegrade of the Mongo store into a Segment-Tar store. 
> Our datastore is over 1TB in size, so copying the binaries wasn't an option. 
> The new repository is simply reusing the existing datastore. The "in-memory" 
> blobIDs still look like external blobIDs to the sidegrade process, so they 
> were copied into the Segment-Tar repository as-is, instead of being converted 
> into the efficient in-line format.
> # The server started up without issues on the new Segment-Tar store. The 
> migrated "in-memory" blob IDs seem to work fine, if a bit sub-optimal.
> # At this point, we have created a cold standby instance by copying the files 
> of the stopped primary instance and making the necessary config changes on 
> both servers.
> # Everything worked fine until the primary server started its first round of 
> online revision GC. After that process completed, the standby node started 
> throwing exceptions about missing segments, and eventually stopped 
> altogether. In the meantime, the following warning showed up in the primary 
> log:
> {code:java}
> 29.09.2017 06:12:08.088 *WARN* [nioEventLoopGroup-3-10] org.apache.jackrabbit.oak.segment.standby.server.ExceptionHandler Exception caught on the server
> io.netty.handler.codec.TooLongFrameException: frame length (8208) exceeds the allowed maximum (8192)
> at io.netty.handler.codec.LineBasedFrameDecoder.fail(LineBasedFrameDecoder.java:146)
> at io.netty.handler.codec.LineBasedFrameDecoder.fail(LineBasedFrameDecoder.java:142)
> at io.netty.handler.codec.LineBasedFrameDecoder.decode(LineBasedFrameDecoder.java:99)
> at io.netty.handler.codec.LineBasedFrameDecoder.decode(LineBasedFrameDecoder.java:75)
> at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:411)
> at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:248)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:345)
> at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:345)
> at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at 

[jira] [Commented] (OAK-6749) Segment-Tar standby sync fails with "in-memory" blobs present in the source repo

2019-01-24 Thread Csaba Varga (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-6749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16750926#comment-16750926
 ] 

Csaba Varga commented on OAK-6749:
--

{quote}Csaba Varga, for the sake of completeness, can you describe how you 
worked around the problem?
{quote}
Sure!

I used the "console" mode of oak-run to run a custom Groovy script on the 
Segment-Tar version of the repository, i.e. directly after completing the 
sidegrade from Mongo. This custom script used the Oak API directly to recreate 
the affected blobs (blobs where InMemoryDataRecord.isInstance() returns true 
for the ID). Blobs created by the Segment-Tar implementation will never be 
InMemoryDataRecord instances (at least as of 1.6), so this procedure allowed me 
to sidestep the issue with syncing those instances.

The aggressive de-duplicating behavior of Oak, which is normally a good thing, 
caused me some headaches while writing the script. I ended up creating new 
properties on the affected nodes with a predictable name (a fixed string 
prepended to the original name) and deleting the affected properties to make 
sure the new blobs are referenced and the old "in-memory" blob IDs are left 
unreferenced. Then, as a second phase, I re-created the original properties 
based on the prefixed ones and deleted the prefixed ones. The end result was a 
Segment-Tar repository with the exact same semantics as the original, but with 
no "in-memory" blob IDs referenced by the head revision.

I can share the Groovy script itself if you're interested in the gory details.
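Purely for illustration, here is a minimal Java sketch of that two-phase procedure against the Oak {{NodeStore}} API. This is not the attached repack_binaries.groovy (which may differ in detail), and {{RepackSketch}} and the prefix are made up:

{code:java}
// Hedged sketch of phase 1: re-create each affected binary under a prefixed
// name and drop the original, leaving the old "in-memory" blob ID
// unreferenced. Phase 2 (restoring the original names) is analogous.
import java.io.IOException;
import org.apache.jackrabbit.oak.api.Blob;
import org.apache.jackrabbit.oak.api.Type;
import org.apache.jackrabbit.oak.spi.state.NodeBuilder;
import org.apache.jackrabbit.oak.spi.state.NodeStore;

class RepackSketch {
    static final String PREFIX = "repacked-"; // fixed, predictable prefix

    static void repackProperty(NodeStore store, NodeBuilder node, String name)
            throws IOException {
        Blob old = node.getProperty(name).getValue(Type.BINARY);
        // A blob created through the NodeStore is written by the Segment-Tar
        // implementation, so it is never an InMemoryDataRecord.
        Blob fresh = store.createBlob(old.getNewStream());
        node.setProperty(PREFIX + name, fresh, Type.BINARY);
        node.removeProperty(name);
        // The caller merges the root builder once per batch, e.g.:
        // store.merge(rootBuilder, EmptyHook.INSTANCE, CommitInfo.EMPTY);
    }
}
{code}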

> Segment-Tar standby sync fails with "in-memory" blobs present in the source 
> repo
> 
>
> Key: OAK-6749
> URL: https://issues.apache.org/jira/browse/OAK-6749
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: blob, tarmk-standby
>Affects Versions: 1.6.2
>Reporter: Csaba Varga
>Assignee: Francesco Mari
>Priority: Major
> Fix For: 1.10.1, 1.8.12, 1.10
>
> Attachments: OAK-6749-01.patch, OAK-6749-02.patch
>
>
> We have run into an issue when trying to transition from an active/active 
> Mongo NodeStore cluster to a single Segment-Tar server with cold standby. The 
> issue itself manifests when the standby server tries to pull changes from the 
> primary after the first round of online revision GC.
> Let me summarize the way we ended up with the current state, and my 
> hypothesis about what happened, based on my debugging so far:
> # We started with a Mongo NodeStore and an external FileDataStore as the blob 
> store. The FileDataStore was set up with minRecordLength=4096. The Mongo 
> store stores blobs below minRecordLength as special "in-memory" blobIDs where 
> the data itself is baked into the ID string in hex.
> # We have executed a sidegrade of the Mongo store into a Segment-Tar store. 
> Our datastore is over 1TB in size, so copying the binaries wasn't an option. 
> The new repository is simply reusing the existing datastore. The "in-memory" 
> blobIDs still look like external blobIDs to the sidegrade process, so they 
> were copied into the Segment-Tar repository as-is, instead of being converted 
> into the efficient in-line format.
> # The server started up without issues on the new Segment-Tar store. The 
> migrated "in-memory" blob IDs seem to work fine, if a bit sub-optimal.
> # At this point, we have created a cold standby instance by copying the files 
> of the stopped primary instance and making the necessary config changes on 
> both servers.
> # Everything worked fine until the primary server started its first round of 
> online revision GC. After that process completed, the standby node started 
> throwing exceptions about missing segments, and eventually stopped 
> altogether. In the meantime, the following warning showed up in the primary 
> log:
> {code:java}
> 29.09.2017 06:12:08.088 *WARN* [nioEventLoopGroup-3-10] org.apache.jackrabbit.oak.segment.standby.server.ExceptionHandler Exception caught on the server
> io.netty.handler.codec.TooLongFrameException: frame length (8208) exceeds the allowed maximum (8192)
> at io.netty.handler.codec.LineBasedFrameDecoder.fail(LineBasedFrameDecoder.java:146)
> at io.netty.handler.codec.LineBasedFrameDecoder.fail(LineBasedFrameDecoder.java:142)
> at io.netty.handler.codec.LineBasedFrameDecoder.decode(LineBasedFrameDecoder.java:99)
> at io.netty.handler.codec.LineBasedFrameDecoder.decode(LineBasedFrameDecoder.java:75)
> at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:411)
> at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:248)
> at 

[jira] [Commented] (OAK-7947) Lazy loading of Lucene index files startup

2019-01-24 Thread Thomas Mueller (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-7947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16750890#comment-16750890
 ] 

Thomas Mueller commented on OAK-7947:
-

The following addition doesn't download the indexes (only updates the stats for 
the indexes that are already downloaded, that is, only for those that are shown 
in the JMX bean table).

Maybe we could have some "middle ground": by default, download the indexes 
during the index update cycle, but only those that aren't deprecated. That 
way, the index update doesn't cause large deprecated indexes to be downloaded. 
For non-deprecated indexes, I think it's actually good to download them quite 
early on, and the index update mechanism sounds like a good fit for that.

{noformat}
--- src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/LuceneIndexMBeanImpl.java (revision 1851902)
+++ src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/LuceneIndexMBeanImpl.java (working copy)
@@ -93,6 +93,8 @@
 
 public class LuceneIndexMBeanImpl extends AnnotatedStandardMBean implements LuceneIndexMBean {
 
+    private static final boolean LOAD_INDEX_FOR_STATS = Boolean.parseBoolean(System.getProperty("oak.lucene.LoadIndexForStats", "false"));
+
     private final Logger log = LoggerFactory.getLogger(getClass());
     private final IndexTracker indexTracker;
     private final NodeStore nodeStore;
@@ -381,11 +383,21 @@
 
     @Override
     public String getSize(String indexPath) throws IOException {
+        if (!LOAD_INDEX_FOR_STATS) {
+            if (!indexTracker.getIndexNodePaths().contains(indexPath)) {
+                return "-1";
+            }
+        }
         return String.valueOf(getIndexStats(indexPath).indexSize);
     }
 
     @Override
     public String getDocCount(String indexPath) throws IOException {
+        if (!LOAD_INDEX_FOR_STATS) {
+            if (!indexTracker.getIndexNodePaths().contains(indexPath)) {
+                return "-1";
+            }
+        }
         return String.valueOf(getIndexStats(indexPath).numDocs);
    }
{noformat}
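With that guard in place, downloading an index just to compute its stats becomes opt-in: starting the JVM with {{-Doak.lucene.LoadIndexForStats=true}} restores the old behavior, while by default both methods return {{-1}} for indexes that aren't already open.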

> Lazy loading of Lucene index files startup
> --
>
> Key: OAK-7947
> URL: https://issues.apache.org/jira/browse/OAK-7947
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: lucene, query
>Reporter: Thomas Mueller
>Assignee: Thomas Mueller
>Priority: Major
> Attachments: OAK-7947.patch, OAK-7947_v2.patch, OAK-7947_v3.patch, 
> OAK-7947_v4.patch, OAK-7947_v5.patch, lucene-index-open-access.zip
>
>
> Right now, all Lucene index binaries are loaded on startup (I think when the 
> first query is run, to do cost calculation). This is a performance problem if 
> the index files are large and need to be downloaded from the data store.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-7947) Lazy loading of Lucene index files startup

2019-01-24 Thread Thomas Mueller (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-7947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16750871#comment-16750871
 ] 

Thomas Mueller commented on OAK-7947:
-

[~teofili] [~catholicon] maybe it's alright if the async index update downloads 
all the index files (even if the index wasn't updated or used so far), what do 
you think?

What about adding a system property so that this behavior (basically OAK-7893) 
can be disabled? If I do that and set the system property, then at startup only 
the indexes that I would expect are downloaded.

> Lazy loading of Lucene index files startup
> --
>
> Key: OAK-7947
> URL: https://issues.apache.org/jira/browse/OAK-7947
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: lucene, query
>Reporter: Thomas Mueller
>Assignee: Thomas Mueller
>Priority: Major
> Attachments: OAK-7947.patch, OAK-7947_v2.patch, OAK-7947_v3.patch, 
> OAK-7947_v4.patch, OAK-7947_v5.patch, lucene-index-open-access.zip
>
>
> Right now, all Lucene index binaries are loaded on startup (I think when the 
> first query is run, to do cost calculation). This is a performance problem if 
> the index files are large and need to be downloaded from the data store.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-7947) Lazy loading of Lucene index files startup

2019-01-24 Thread Thomas Mueller (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-7947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16750849#comment-16750849
 ] 

Thomas Mueller commented on OAK-7947:
-

Index copying takes place here:
{noformat}
at org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnReadDirectory.<init>(CopyOnReadDirectory.java:83) [org.apache.jackrabbit.oak-lucene:1.12.0.SNAPSHOT]
at org.apache.jackrabbit.oak.plugins.index.lucene.IndexCopier.wrapForRead(IndexCopier.java:124) [org.apache.jackrabbit.oak-lucene:1.12.0.SNAPSHOT]
at org.apache.jackrabbit.oak.plugins.index.lucene.reader.DefaultIndexReaderFactory.createReader(DefaultIndexReaderFactory.java:97) [org.apache.jackrabbit.oak-lucene:1.12.0.SNAPSHOT]
at org.apache.jackrabbit.oak.plugins.index.lucene.reader.DefaultIndexReaderFactory.createReader(DefaultIndexReaderFactory.java:85) [org.apache.jackrabbit.oak-lucene:1.12.0.SNAPSHOT]
at org.apache.jackrabbit.oak.plugins.index.lucene.reader.DefaultIndexReaderFactory.createMountedReaders(DefaultIndexReaderFactory.java:67) [org.apache.jackrabbit.oak-lucene:1.12.0.SNAPSHOT]
at org.apache.jackrabbit.oak.plugins.index.lucene.reader.DefaultIndexReaderFactory.createReaders(DefaultIndexReaderFactory.java:60) [org.apache.jackrabbit.oak-lucene:1.12.0.SNAPSHOT]
at org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexNodeManager.open(LuceneIndexNodeManager.java:72) [org.apache.jackrabbit.oak-lucene:1.12.0.SNAPSHOT]
at org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker.findIndexNode(IndexTracker.java:243) [org.apache.jackrabbit.oak-lucene:1.12.0.SNAPSHOT]
at org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker.acquireIndexNode(IndexTracker.java:212) [org.apache.jackrabbit.oak-lucene:1.12.0.SNAPSHOT]
at org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexMBeanImpl.getIndexStats(LuceneIndexMBeanImpl.java:143) [org.apache.jackrabbit.oak-lucene:1.12.0.SNAPSHOT]
at org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexMBeanImpl.getDocCount(LuceneIndexMBeanImpl.java:389) [org.apache.jackrabbit.oak-lucene:1.12.0.SNAPSHOT]
at org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexStatsUpdateCallback.done(LuceneIndexStatsUpdateCallback.java:64) [org.apache.jackrabbit.oak-lucene:1.12.0.SNAPSHOT]
at org.apache.jackrabbit.oak.plugins.index.search.CompositePropertyUpdateCallback.done(CompositePropertyUpdateCallback.java:53) [org.apache.jackrabbit.oak-lucene:1.12.0.SNAPSHOT]
at org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditor.leave(LuceneIndexEditor.java:157) [org.apache.jackrabbit.oak-lucene:1.12.0.SNAPSHOT]
at org.apache.jackrabbit.oak.plugins.index.IndexUpdate.leave(IndexUpdate.java:397) [org.apache.jackrabbit.oak-core:1.10.0.SNAPSHOT]
at org.apache.jackrabbit.oak.spi.commit.VisibleEditor.leave(VisibleEditor.java:59) [org.apache.jackrabbit.oak-store-spi:1.9.10.R1845889]
at org.apache.jackrabbit.oak.spi.commit.EditorDiff.process(EditorDiff.java:55) [org.apache.jackrabbit.oak-store-spi:1.9.10.R1845889]
at org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate.updateIndex(AsyncIndexUpdate.java:728) [org.apache.jackrabbit.oak-core:1.10.0.SNAPSHOT]
at org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate.runWhenPermitted(AsyncIndexUpdate.java:573) [org.apache.jackrabbit.oak-core:1.10.0.SNAPSHOT]
at org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate.run(AsyncIndexUpdate.java:432) [org.apache.jackrabbit.oak-core:1.10.0.SNAPSHOT]
at org.apache.sling.commons.scheduler.impl.QuartzJobExecutor.execute(QuartzJobExecutor.java:347) [org.apache.sling.commons.scheduler:2.7.2]
at org.quartz.core.JobRunShell.run(JobRunShell.java:202) [org.apache.sling.commons.scheduler:2.7.2]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{noformat}

It looks like this is relatively new code, due to OAK-7893. Calculating index 
statistics right now causes indexes to be downloaded. This happens for every 
Lucene index, at every index update (whether or not a specific index was 
changed).



> Lazy loading of Lucene index files startup
> --
>
> Key: OAK-7947
> URL: https://issues.apache.org/jira/browse/OAK-7947
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: lucene, query
>Reporter: Thomas Mueller
>Assignee: Thomas Mueller
>Priority: Major
> Attachments: OAK-7947.patch, OAK-7947_v2.patch, OAK-7947_v3.patch, 
> OAK-7947_v4.patch, OAK-7947_v5.patch, lucene-index-open-access.zip
>
>
> Right now, all