[jira] [Created] (OAK-9790) Implement parallel indexing for speeding up oak run indexing command

2022-06-01 Thread Jun Zhang (Jira)
Jun Zhang created OAK-9790:
--

             Summary: Implement parallel indexing for speeding up oak run indexing command
                 Key: OAK-9790
                 URL: https://issues.apache.org/jira/browse/OAK-9790
             Project: Jackrabbit Oak
          Issue Type: Story
            Reporter: Jun Zhang


Implement parallel indexing for speeding up oak run indexing command

Indexing is currently single-threaded, which is slow for large repositories. To 
improve indexing performance, we need to implement parallel indexing.

The work covers both Lucene and Elastic indexing. To support parallel indexing, 
the large flat file store file has to be split ahead of time. The split adds 
significant overhead, but it makes parallel indexing possible and much faster.

A related change is support for LZ4 compression, which is much faster than 
gzip.
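
A minimal sketch of the split-then-index idea described above, written in Python for brevity and not taken from Oak's code. It assumes the flat file store is a text file with one node entry per line; `split_flat_file`, `index_part`, and the default part count are hypothetical names chosen for illustration, and the "indexing" step only counts entries. A real split may also need to pick boundaries so that related entries stay in the same part, which is not modeled here.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import List


def split_flat_file(source: Path, out_dir: Path, parts: int) -> List[Path]:
    """Split a line-oriented file into at most `parts` files of contiguous lines."""
    lines = source.read_text().splitlines(keepends=True)
    size = max(1, -(-len(lines) // parts))  # ceiling division, at least one line
    paths: List[Path] = []
    for start in range(0, len(lines), size):
        part = out_dir / f"part-{len(paths)}.txt"
        part.write_text("".join(lines[start:start + size]))
        paths.append(part)
    return paths


def index_part(part: Path) -> int:
    """Stand-in for indexing one part; here it only counts the entries."""
    with part.open() as handle:
        return sum(1 for _ in handle)


def index_in_parallel(source: Path, parts: int = 4) -> int:
    """Split first (the up-front overhead), then process the parts concurrently."""
    with tempfile.TemporaryDirectory() as tmp:
        part_files = split_flat_file(source, Path(tmp), parts)
        with ThreadPoolExecutor(max_workers=parts) as pool:
            return sum(pool.map(index_part, part_files))
```

The split is done once and sequentially, which mirrors the overhead mentioned above; the gain comes from the parts being processed independently afterwards.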

 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (OAK-9760) Oak run index purge command active index check is in correct

2022-05-05 Thread Jun Zhang (Jira)
Jun Zhang created OAK-9760:
--

             Summary: Oak run index purge command active index check is in correct
                 Key: OAK-9760
                 URL: https://issues.apache.org/jira/browse/OAK-9760
             Project: Jackrabbit Oak
          Issue Type: Bug
            Reporter: Jun Zhang


The active index check of the oak run index purge command is incorrect.

Currently the logic is based only on the index name; it does not yet include 
the Oak mount and merge property checks.
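
For illustration, a name-only check could look like the sketch below (Python, not Oak's code). It assumes versioned index names of the form `base-N` or `base-N-custom-M`, as in `damAssetElastic-7-custom-2`; `parse_index_name` and `active_by_name` are hypothetical helpers. Nothing in it looks at mounts or at a merge property, which is the gap this issue describes.

```python
import re
from typing import Dict, Iterable, NamedTuple, Tuple

# Assumed naming scheme: <base>[-<product version>][-custom-<custom version>]
NAME_PATTERN = re.compile(
    r"^(?P<base>.+?)(?:-(?P<product>\d+))?(?:-custom-(?P<custom>\d+))?$"
)


class IndexVersion(NamedTuple):
    base: str
    product: int
    custom: int


def parse_index_name(name: str) -> IndexVersion:
    """Split an index name into base name, product version, and custom version."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a valid index name: {name!r}")
    return IndexVersion(
        match["base"], int(match["product"] or 0), int(match["custom"] or 0)
    )


def active_by_name(names: Iterable[str]) -> Dict[str, str]:
    """Per base name, pick the highest version using the name alone."""
    latest: Dict[str, Tuple[Tuple[int, int], str]] = {}
    for name in names:
        version = parse_index_name(name)
        key = (version.product, version.custom)
        if version.base not in latest or key > latest[version.base][0]:
            latest[version.base] = (key, name)
    return {base: name for base, (_, name) in latest.items()}
```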





[jira] [Created] (OAK-9734) Index purge should prevent fully delete index definition which is is read-only repo

2022-03-21 Thread Jun Zhang (Jira)
Jun Zhang created OAK-9734:
--

             Summary: Index purge should prevent fully delete index definition which is is read-only repo
                 Key: OAK-9734
                 URL: https://issues.apache.org/jira/browse/OAK-9734
             Project: Jackrabbit Oak
          Issue Type: Story
            Reporter: Jun Zhang


Index purge should not fully delete an index definition that is in the 
read-only repo.

In the composite node store case, when the index comes from the read-only repo, 
the index definition should not be removed; otherwise, it will be recreated.

In addition, a disabled index should be fully deleted once the index has 
already been removed from the read-only repo.





[jira] [Updated] (OAK-9726) Improve index purge old version commands logs

2022-03-16 Thread Jun Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-9726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Zhang updated OAK-9726:
---
Summary: Improve index purge old version commands logs  (was: Improve index 
purge old version commands)

> Improve index purge old version commands logs
> -
>
>                 Key: OAK-9726
>                 URL: https://issues.apache.org/jira/browse/OAK-9726
>             Project: Jackrabbit Oak
>          Issue Type: Story
>          Components: indexing, oak-run
>            Reporter: Jun Zhang
>            Priority: Major
>
> Improve index purge old version commands
> The log messages need to be improved so that they clearly show what purging 
> does.





[jira] [Created] (OAK-9726) Improve index purge old version commands

2022-03-16 Thread Jun Zhang (Jira)
Jun Zhang created OAK-9726:
--

             Summary: Improve index purge old version commands
                 Key: OAK-9726
                 URL: https://issues.apache.org/jira/browse/OAK-9726
             Project: Jackrabbit Oak
          Issue Type: Story
          Components: indexing, oak-run
            Reporter: Jun Zhang


Improve index purge old version commands

The log messages need to be improved so that they clearly show what purging 
does.





[jira] [Created] (OAK-9705) Explain Query tool doesn't show the correct ES query when suggest queries are made

2022-02-23 Thread Jun Zhang (Jira)
Jun Zhang created OAK-9705:
--

             Summary: Explain Query tool doesn't show the correct ES query when suggest queries are made
                 Key: OAK-9705
                 URL: https://issues.apache.org/jira/browse/OAK-9705
             Project: Jackrabbit Oak
          Issue Type: Bug
            Reporter: Jun Zhang


When running the following suggest query from AEM (with an ES index available):
{code}
SELECT [rep:suggest()] FROM [dam:Asset] as s WHERE SUGGEST('jav') option(index name [damAssetElastic-7-custom-2])
{code}
the detailed plan shows that the following query is run on the ES side:

{code}
{"bool":{"must":[{"query_string":{"query":"suggest?term=jav","fields":[],"type":"best_fields","default_operator":"or","max_determinized_states":1,"enable_position_increments":true,"fuzziness":"AUTO","fuzzy_prefix_length":0,"fuzzy_max_expansions":50,"phrase_slop":0,"escape":false,"auto_generate_synonyms_phrase_query":true,"fuzzy_transpositions":true,"boost":1.0}}],"adjust_pure_negative":true,"boost":1.0}}
{code}


Instead, the following query is actually run on the ES side:
{code}
POST cm-p11553-e21096-publish._damassetelastic-7-custom-2/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "nested": {
            "path": ":suggest",
            "query": {
              "match_phrase_prefix": {
                ":suggest.value": {
                  "query": "jav"
                }
              }
            }
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1
    }
  }
}
{code}

The Explain Query tool should reflect the correct ES query as well.
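
The request body that is actually sent can be written down as a small builder. This is a Python sketch that only reproduces the JSON shown above; `suggest_query` is a hypothetical helper and not part of Oak or of any Elasticsearch client.

```python
import json


def suggest_query(term: str) -> dict:
    """Build the nested match_phrase_prefix body used for suggest queries."""
    return {
        "query": {
            "bool": {
                "must": [
                    {
                        "nested": {
                            "path": ":suggest",
                            "query": {
                                "match_phrase_prefix": {
                                    ":suggest.value": {"query": term}
                                }
                            },
                        }
                    }
                ],
                "adjust_pure_negative": True,
                "boost": 1,
            }
        }
    }


def to_json(term: str) -> str:
    """Serialize the body so it can be compared with the explain output."""
    return json.dumps(suggest_query(term), indent=2)
```

Comparing `to_json("jav")` with the query printed in the detailed plan makes the mismatch visible: the plan reports a `query_string` clause, while the request uses a `nested` clause on `:suggest`.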





[jira] [Created] (OAK-9696) Improve query syntax support for dynamicBoost in ElasticSearch

2022-02-14 Thread Jun Zhang (Jira)
Jun Zhang created OAK-9696:
--

             Summary: Improve query syntax support for dynamicBoost in ElasticSearch
                 Key: OAK-9696
                 URL: https://issues.apache.org/jira/browse/OAK-9696
             Project: Jackrabbit Oak
          Issue Type: Task
          Components: elastic-search
            Reporter: Jun Zhang


Improve query syntax support for dynamicBoost in ElasticSearch

Currently, query syntax support for dynamicBoost is not as good as in Lucene: 
wildcards, exclusion with -, and explicit OR are missing or do not behave the 
same way.





[jira] [Comment Edited] (OAK-9691) Improve fulltext query syntax support for ElasticSearch

2022-02-14 Thread Jun Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/OAK-9691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17492158#comment-17492158
 ] 

Jun Zhang edited comment on OAK-9691 at 2/14/22, 6:32 PM:
--

Fixed in [https://github.com/apache/jackrabbit-oak/pull/489]


was (Author: francoiszhang):
Fixed in [https://github.com/apache/jackrabbit-oak/pull/489]

 

> Improve fulltext query syntax support for ElasticSearch
> ---
>
>                 Key: OAK-9691
>                 URL: https://issues.apache.org/jira/browse/OAK-9691
>             Project: Jackrabbit Oak
>          Issue Type: Task
>          Components: elastic-search
>            Reporter: Jun Zhang
>            Priority: Major
>
> Improve fulltext query syntax support for ElasticSearch
> Currently, query syntax support for fulltext is not as good as in Lucene: 
> wildcards, exclusion with -, and explicit OR are missing.
>  





[jira] [Commented] (OAK-9691) Improve fulltext query syntax support for ElasticSearch

2022-02-14 Thread Jun Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/OAK-9691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17492158#comment-17492158
 ] 

Jun Zhang commented on OAK-9691:


Fixed in [https://github.com/apache/jackrabbit-oak/pull/489]

 

> Improve fulltext query syntax support for ElasticSearch
> ---
>
>                 Key: OAK-9691
>                 URL: https://issues.apache.org/jira/browse/OAK-9691
>             Project: Jackrabbit Oak
>          Issue Type: Task
>          Components: elastic-search
>            Reporter: Jun Zhang
>            Priority: Major
>
> Improve fulltext query syntax support for ElasticSearch
> Currently, query syntax support for fulltext is not as good as in Lucene: 
> wildcards, exclusion with -, and explicit OR are missing.
>  





[jira] [Created] (OAK-9691) Improve fulltext query syntax support for ElasticSearch

2022-02-09 Thread Jun Zhang (Jira)
Jun Zhang created OAK-9691:
--

             Summary: Improve fulltext query syntax support for ElasticSearch
                 Key: OAK-9691
                 URL: https://issues.apache.org/jira/browse/OAK-9691
             Project: Jackrabbit Oak
          Issue Type: Task
          Components: elastic-search
            Reporter: Jun Zhang


Improve fulltext query syntax support for ElasticSearch

Currently, query syntax support for fulltext is not as good as in Lucene: 
wildcards, exclusion with -, and explicit OR are missing.
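
To make the three features concrete, here is a toy parser in Python. It is a sketch under simplifying assumptions (whitespace-separated terms, an uppercase `OR` keyword, a leading `-` for exclusion, `*` or `?` as wildcards) and is neither the Oak fulltext grammar nor what the Elastic index actually does; all names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ParsedFulltext:
    # Alternatives joined by explicit OR; terms inside one alternative are all required.
    alternatives: List[List[str]] = field(default_factory=list)
    # Terms that must not occur (written with a leading "-").
    excluded: List[str] = field(default_factory=list)


def is_wildcard(term: str) -> bool:
    """A term counts as a wildcard term if it contains * or ?."""
    return "*" in term or "?" in term


def parse_fulltext(text: str) -> ParsedFulltext:
    """Split a fulltext expression into OR alternatives and excluded terms."""
    parsed = ParsedFulltext()
    current: List[str] = []
    for token in text.split():
        if token == "OR":
            parsed.alternatives.append(current)
            current = []
        elif token.startswith("-") and len(token) > 1:
            parsed.excluded.append(token[1:])
        else:
            current.append(token)
    parsed.alternatives.append(current)
    return parsed
```

For example, `parse_fulltext("java* OR kotlin -script")` yields the alternatives `["java*"]` and `["kotlin"]` with `script` excluded, and `is_wildcard("java*")` is true.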

 





[jira] [Created] (OAK-9671) Increase dynamicBoost and dynamicBoostLite full-text test coverage

2022-01-18 Thread Jun Zhang (Jira)
Jun Zhang created OAK-9671:
--

             Summary: Increase dynamicBoost and dynamicBoostLite full-text test coverage
                 Key: OAK-9671
                 URL: https://issues.apache.org/jira/browse/OAK-9671
             Project: Jackrabbit Oak
          Issue Type: Task
          Components: elastic-search, lucene
            Reporter: Jun Zhang


dynamicBoost and dynamicBoostLite have limited full-text capabilities. The 
Elastic implementation of dynamicBoost offers some full-text capability without 
affecting the index size.

In general, this feature has good test coverage for the indexing part but only 
very basic tests around the queries. The reason for this is that in Lucene the 
query logic is not part of Oak; it resides in an external component not owned 
by the indexing team.

The goal of this task is to:

1) improve unit tests for dynamicBoostLite (this can be done for all index 
types)

2) improve full-text unit tests for dynamicBoost in Elastic. Compared to 
Lucene, we have more flexibility since there are no dependencies on external 
code.

Once 1 is implemented, we can potentially improve full-text support for 
dynamicBoostLite using more sophisticated queries (currently a simple Term 
query is used).





[jira] [Created] (OAK-8969) Ignore domain overwrite doesn't work well when presignedHttpDownloadURICacheMaxSize is set

2020-03-24 Thread Jun Zhang (Jira)
Jun Zhang created OAK-8969:
--

             Summary: Ignore domain overwrite doesn't work well when presignedHttpDownloadURICacheMaxSize is set
                 Key: OAK-8969
                 URL: https://issues.apache.org/jira/browse/OAK-8969
             Project: Jackrabbit Oak
          Issue Type: Bug
          Components: blob-cloud
            Reporter: Jun Zhang


Ignore domain overwrite doesn't work well when 
presignedHttpDownloadURICacheMaxSize is set





[jira] [Created] (OAK-8896) API withDomainOverrideIgnored doesn't match the implementation of BinaryUploadOptions

2020-02-07 Thread Jun Zhang (Jira)
Jun Zhang created OAK-8896:
--

             Summary: API withDomainOverrideIgnored doesn't match the implementation of BinaryUploadOptions
                 Key: OAK-8896
                 URL: https://issues.apache.org/jira/browse/OAK-8896
             Project: Jackrabbit Oak
          Issue Type: Bug
          Components: api
    Affects Versions: 1.24.0
            Reporter: Jun Zhang


According to the documentation, the new API is named 
{{withDomainOverrideIgnored}} for both BinaryDownloadOptions and 
BinaryUploadOptions, but in the implementation of the BinaryUploadOptions class 
it is {{withDomainOverrideIgnore}}: the method name is missing the trailing 
{{d}}.

The documentation is wrong at the moment, and the two API names do not match 
each other. Should we change the method name, or leave it as is and fix the 
documentation?

[https://github.com/apache/jackrabbit-oak/blob/trunk/oak-jackrabbit-api/src/main/java/org/apache/jackrabbit/api/binary/BinaryUploadOptions.java#L96]
[https://github.com/apache/jackrabbit-oak/blob/trunk/oak-jackrabbit-api/src/main/java/org/apache/jackrabbit/api/binary/BinaryDownloadOptions.java#L325]

Doc:
[https://jackrabbit.apache.org/oak/docs/features/direct-binary-access.html]


