date:20220725



JoeHF commented on PR #1003:
URL: https://github.com/apache/lucene/pull/1003#issuecomment-1195050814

   No obvious regression or perf improvement; I guess there are no such cases 
in the benchmark.
   wikimedium10k:
   Task                        QPS baseline   StdDev  QPS my_modified_version   StdDev Pct diff                  p-value
   BrowseRandomLabelTaxoFacets       569.65   (7.9%)                   543.58  (15.4%)    -4.6% ( -25% -   20%)    0.236
   Prefix3                           377.77   (9.1%)                   368.32   (6.6%)    -2.5% ( -16% -   14%)    0.321
   AndHighMed                        656.18   (8.2%)                   648.39  (10.6%)    -1.2% ( -18% -   19%)    0.691
   MedIntervalsOrdered               574.68   (6.3%)                   567.95   (9.7%)    -1.2% ( -16% -   15%)    0.651
   AndHighLow                        978.77   (9.3%)                   972.00   (8.5%)    -0.7% ( -16% -   18%)    0.806
   HighSpanNear                      425.66   (8.3%)                   423.78  (10.2%)    -0.4% ( -17% -   19%)    0.880
   OrHighMed                         656.72   (8.2%)                   655.28  (10.5%)    -0.2% ( -17% -   20%)    0.942
   LowIntervalsOrdered               481.42   (5.2%)                   480.65  (10.3%)    -0.2% ( -14% -   16%)    0.951
   HighPhrase                        500.26   (7.6%)                   499.86  (11.4%)    -0.1% ( -17% -   20%)    0.979
   Respell                           123.33  (11.8%)                   123.48  (10.4%)     0.1% ( -19% -   25%)    0.973
   OrHighHigh                        416.58   (6.9%)                   417.19   (9.4%)     0.1% ( -15% -   17%)    0.955
   MedTerm                          2063.41   (9.5%)                  2069.51  (11.0%)     0.3% ( -18% -   23%)    0.928
   LowSloppyPhrase                   301.12   (7.5%)                   303.12  (12.6%)     0.7% ( -18% -   22%)    0.840
   HighTerm                         1088.05   (9.8%)                  1102.10  (14.8%)     1.3% ( -21% -   28%)    0.745
   LowPhrase                         896.10   (8.4%)                   907.71   (9.8%)     1.3% ( -15% -   21%)    0.654
   HighSloppyPhrase                  309.31   (8.1%)                   313.60  (10.0%)     1.4% ( -15% -   21%)    0.629
   Fuzzy2                             42.78  (11.1%)                    43.46  (12.2%)     1.6% ( -19% -   27%)    0.665
   Wildcard                          315.36   (9.2%)                   320.46   (7.7%)     1.6% ( -14% -   20%)    0.548
   MedSpanNear                       520.33   (6.6%)                   530.21  (11.6%)     1.9% ( -15% -   21%)    0.524
   HighIntervalsOrdered              356.49  (10.3%)                   363.39  (10.1%)     1.9% ( -16% -   24%)    0.547
   AndHighHigh                       619.32   (5.9%)                   631.54   (9.5%)     2.0% ( -12% -   18%)    0.432
   HighTermMonthSort                1479.95   (6.0%)                  1509.95  (11.1%)     2.0% ( -14% -   20%)    0.472
   MedSloppyPhrase                   230.30   (8.6%)                   235.24  (10.8%)     2.1% ( -15% -   23%)    0.488
   MedPhrase                         567.04   (6.2%)                   579.72  (11.5%)     2.2% ( -14% -   21%)    0.442
   BrowseRandomLabelSSDVFacets       350.13  (10.2%)                   358.12  (16.8%)     2.3% ( -22% -   32%)    0.604
   HighTermDayOfYearSort            1087.80   (7.4%)                  1118.61   (8.5%)     2.8% ( -12% -   20%)    0.260
   LowTerm                          2557.43   (9.2%)                  2636.37   (8.9%)     3.1% ( -13% -   23%)    0.281
   LowSpanNear                       795.88   (9.0%)                   828.70  (11.1%)     4.1% ( -14% -   26%)    0.195
   PKLookup                           26.79  (16.3%)                    27.91  (19.8%)     4.2% ( -27% -   48%)    0.466
   Fuzzy1                            136.23   (9.7%)                   142.21  (16.8%)     4.4% ( -20% -   34%)    0.312
   BrowseMonthTaxoFacets             801.97  (17.7%)                   840.43  (19.1%)     4.8% ( -27% -   50%)    0.410
   IntNRQ                            603.46  (10.0%)                   636.52   (7.9%)     5.5% ( -11% -   25%)    0.054
   OrHighLow                         532.25   (9.0%)                   562.37  (13.6%)     5.7% ( -15% -   31%)    0.121
   BrowseMonthSSDVFacets             839.55  (20.9%)                   894.35  (22.1%)     6.5% ( -30% -   62%)    0.337
   BrowseDayOfYearTaxoFacets         784.80  (16.4%)                   839.36  (25.1%)     7.0% ( -29% -   58%)    0.300
   BrowseDateTaxoFacets              849.34  (17.9%)                   908.75  (25.8%)     7.0% ( -31% -   61%)    0.319
   BrowseDayOfYearSSDVFacets         832.41  (17.6%)                   907.43  (22.9%)     9.0% ( -26% -   60%)    0.163
   BrowseDateSSDVFacets              215.63  (21.1%)                   241.38  (27.5%)    11.9% ( -30% -   76%)    0.123
   
   wikimedium1m:
   Task                        QPS baseline   StdDev  QPS my_modified_version   StdDev Pct diff                  p-value
   Respell                            39.67  (11.8%)                    37.57  (14.2%)    -5.3% ( -27% -   23%)    0.200
  BrowseDayOfYear

[GitHub] [lucene] visionarywind opened a new issue, #1048: Why lucene doc id changes after updating or merging?



visionarywind opened a new issue, #1048:
URL: https://github.com/apache/lucene/issues/1048

   ### Description
   
   As far as I know, a Lucene doc ID is an internal docId; it cannot be used as 
an external id.  
   Why is it designed like this?
   Could it be designed to be constant?
   Thank you.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene] nknize commented on pull request #1017: LUCENE-10654: Add new ShapeDocValuesField for LatLonShape and XYShape



nknize commented on PR #1017:
URL: https://github.com/apache/lucene/pull/1017#issuecomment-1194649110

   Thanks @iverase.
   
   > Why so much hurry with this change? ...it will be nice to have something 
production ready.
   
   Just a few points here. 
   1. I don't believe the proposed PR is a change. It's a new field that hasn't 
existed before, despite Elasticsearch carrying its own proprietary 
implementation for two years and only now proposing it as the preferred 
approach. I prefer a fresh Lucene-focused implementation with input from the 
Elasticsearch perspective (e.g., geometry centroid and bounding box were added 
to help Elasticsearch `geo_centroid` and `geo_boundingbox` aggregations but 
are not really needed for the docvalue format).
   2. This PR is a move of progress over perfection. I do feel it's important 
to do the best we can on the first iteration but there will always be needed 
improvements. We have mechanisms in place to enable us to unleash experimental 
features for the purpose of receiving feedback from production. The PR is using 
those mechanisms as intended. No need to iterate to what _we think_ is 
"production ready".
   3. Akin to 2 I'd prefer to avoid waiting another two months+ before the next 
minor release to get this in the wild and start obtaining that feedback and 
iterating. Those iterations will improve with more feedback. If we do feel this 
is cutting it close for 9.3 then I'd prefer merging this PR to main and 9.4 and 
iterate on this code for 9.4. 
   4. I am happy to hear Elasticsearch wanting to contribute their 
implementation but I think it's better to start with a foundation and iterate. 
We can merge any desirable properties from that Elasticsearch field in follow 
ups. I think that will strengthen the field as a whole and agree it's a great 
decision but do not believe it is a requirement before merging the current 
functioning PR. Again, progress over perfection.
   
   > this way of developing this exciting and complex feature is making things 
harder.
   
   Harder for who?
   
   > I would like to propose to initially focus on the data structure and once 
we are happy, we can start integrating the functionality, e.g support for 
queries and so on. 
   
   Query support is already in this PR so I'm not sure what you mean by "start 
integrating the functionality". Regarding the desire to add a visitor access 
pattern that's a nice to have but not a requirement for this PR. I wrote an 
initial rough implementation (because I also thought it would be nice to mimic 
the query visitor pattern) but it's an improvement that is easier to add in a 
follow-up since (as this PR shows) it's not a requirement for supporting the 
queries in the first iteration. I agree with the bounding box improvements 
(which, again, could come later) and will add the centroid fix, but that is 
unrelated to the relation and query logic in this PR which already has parity 
with the BKD index queries and uses the same test scaffolding. 
   
   > Here is our current implementation which can be used as a good starting 
point. 
   
   The current PR already has a starting point so I'm not sure why the proposal 
is to scrap it and start from the Elasticsearch proprietary implementation 
(which could've been proposed two years ago if parity is the concern). I took a 
quick look at the proposed code and have some differences of opinion on the 
implementation:
   
   
   * I didn't see any explicit relation visitors; so it doesn't look like the 
prototype code includes bounding box relation logic or tests against any BKD 
index queries to ensure functional parity.
   * I didn't look at the details of the serialized format. That shouldn't 
matter so long as the API is the same and results are correct. This PR now 
includes a `VERSION` byte to provide a mechanism to change the format and 
ensure backwards compatibility. This reduces Elasticsearch risk.
   * The PR results here are matching the BKD index queries so I'm confident 
the PR query results are correct. I'll add (or someone can add) the visitor 
access pattern in a follow up. Again, it's not a requirement for Lucene (even 
if it is one for Elasticsearch).
   * This PR isolates all ShapeDocValues logic (e.g., Readers and Writers) as 
private logic to the single pkg-private abstract `ShapeDocValues` and only 
exposes field and query instantiation through the public `LatLonShape` and 
`XYShape` access classes. I prefer this approach (which is consistent w/ the 
BKD field API) over the prototype, which conversely splits that logic into 
separate Reader / Writer classes with public abstractions. I think we should 
keep the abstraction and internal API surface area tidy and be thoughtful about 
what's exposed (e.g., only pkg-private `ShapeDocValues` and 
`ShapeDocValueField`, Visitor member class and BaseQuery foundation classes for 
internal extensions, and `LatLonShape` and `XYShape` static factories for 
public access).
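
   To illustrate the `VERSION` byte point above, here is a minimal sketch of how a leading version byte lets a reader dispatch on the serialized format. It is not the PR's Java code: the bounding-box payload layout and the names `encode_bbox` and `decode_bbox` are hypothetical, chosen only to show the mechanism.

```python
import struct

# Hypothetical current format version, mirroring the idea of the PR's VERSION byte.
VERSION = 0


def encode_bbox(min_x: int, min_y: int, max_x: int, max_y: int) -> bytes:
    """Serialize as one version byte followed by four big-endian int32 values."""
    return struct.pack(">Biiii", VERSION, min_x, min_y, max_x, max_y)


def decode_bbox(data: bytes) -> tuple:
    """Read the version byte first, then decode the payload for that version."""
    version = data[0]
    if version == 0:
        return struct.unpack_from(">iiii", data, 1)
    # A future format change would add a new branch here instead of
    # breaking readers of already-written values.
    raise ValueError(f"unsupported doc value format version: {version}")
```

   A reader that encounters an unknown version fails loudly rather than misinterpreting the bytes, which is what makes later encoding changes safe to roll out.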
   
   I'll add th

[jira] [Comment Edited] (LUCENE-10592) Should we build HNSW graph on the fly during indexing



[ 
https://issues.apache.org/jira/browse/LUCENE-10592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17570976#comment-17570976
 ] 

Julie Tibshirani edited comment on LUCENE-10592 at 7/25/22 4:10 PM:


It looks like this commit gave a nice boost to indexing. From your benchmark 
results, we expected a small improvement, but this looks even larger:

!Screen Shot 2022-07-25 at 9.04.11 AM.png|width=582,height=238!


was (Author: julietibs):
It looks like this commit gave a nice boost to indexing. From your benchmark 
results, we expected a small improvement, but this looks even larger:

!Screen Shot 2022-07-25 at 9.04.11 AM.png|width=540,height=221!

> Should we build HNSW graph on the fly during indexing
> -
>
> Key: LUCENE-10592
> URL: https://issues.apache.org/jira/browse/LUCENE-10592
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Mayya Sharipova
>Assignee: Mayya Sharipova
>Priority: Minor
> Fix For: 9.4
>
> Attachments: Screen Shot 2022-07-25 at 9.04.11 AM.png
>
>  Time Spent: 8h
>  Remaining Estimate: 0h
>
> Currently, when we index vectors for KnnVectorField, we buffer those vectors 
> in memory and on flush during a segment construction we build an HNSW graph.  
> As building an HNSW graph is very expensive, this makes flush operation take 
> a lot of time. This also makes overall indexing performance quite 
> unpredictable (as the number of flushes are defined by memory used, and the 
> presence of concurrent searches), e.g. some indexing operations return almost 
> instantly while others that trigger flush take a lot of time. 
> Building an HNSW graph on the fly as we index vectors allows to avoid this 
> problem, and spread a load of HNSW graph construction evenly during indexing.
> This will also supersede LUCENE-10194
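
The trade-off described in the issue can be illustrated with a toy cost model. This is only a sketch under strong assumptions: each graph insertion is counted as one unit of work (real HNSW insertion cost is not constant), and the class names `FlushTimeIndexer` and `OnTheFlyIndexer` are hypothetical, not Lucene classes.

```python
class FlushTimeIndexer:
    """Buffers vectors and does all graph work at flush time."""

    def __init__(self):
        self.buffer = []
        self.graph = []

    def add(self, vector):
        self.buffer.append(vector)
        return 0  # no graph work during add

    def flush(self):
        work = len(self.buffer)  # one unit per buffered vector
        self.graph.extend(self.buffer)
        self.buffer.clear()
        return work


class OnTheFlyIndexer:
    """Inserts each vector into the graph as it is indexed."""

    def __init__(self):
        self.graph = []

    def add(self, vector):
        self.graph.append(vector)
        return 1  # one unit of graph work per add

    def flush(self):
        return 0  # graph is already built


def run(indexer, num_vectors):
    """Return (largest single add cost, flush cost, total cost)."""
    add_costs = [indexer.add([float(i)]) for i in range(num_vectors)]
    flush_cost = indexer.flush()
    return max(add_costs), flush_cost, sum(add_costs) + flush_cost
```

Under this model both strategies do the same total work; the difference is that the flush-time variant concentrates all of it in a single flush call, while the on-the-fly variant spreads it across the add calls.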



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (LUCENE-10592) Should we build HNSW graph on the fly during indexing



[ 
https://issues.apache.org/jira/browse/LUCENE-10592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17570976#comment-17570976
 ] 

Julie Tibshirani commented on LUCENE-10592:
---

It looks like this commit gave a nice boost to indexing. From your benchmark 
results, we expected a small improvement, but this looks even larger:

 !Screen Shot 2022-07-25 at 9.04.11 AM.png! 


[jira] [Updated] (LUCENE-10592) Should we build HNSW graph on the fly during indexing



 [ 
https://issues.apache.org/jira/browse/LUCENE-10592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julie Tibshirani updated LUCENE-10592:
--
Attachment: Screen Shot 2022-07-25 at 9.04.11 AM.png


[GitHub] [lucene-jira-archive] mocobeta merged pull request #81: add my account into maping data



mocobeta merged PR #81:
URL: https://github.com/apache/lucene-jira-archive/pull/81



[GitHub] [lucene-jira-archive] mocobeta commented on pull request #81: add my account into maping data



mocobeta commented on PR #81:
URL: 
https://github.com/apache/lucene-jira-archive/pull/81#issuecomment-1194161000

   Thanks @tang-hi 



[GitHub] [lucene-jira-archive] mikemccand commented on a diff in pull request #80: #79: include parent issue link



mikemccand commented on code in PR #80:
URL: https://github.com/apache/lucene-jira-archive/pull/80#discussion_r928980448


##
migration/src/jira_util.py:
##
@@ -83,6 +83,15 @@ def extract_assignee(o: dict) -> tuple[str, str]:
     return (name, disp_name)
 
 
+def extract_parent(o: dict) -> tuple[str, str]:
+    parent = o["fields"].get("parent")
+    if parent:
+        key = parent["key"]
+        if key:
+            return key, f'https://issues.apache.org/jira/browse/{key}'

Review Comment:
   Ahh good idea, will do.




[GitHub] [lucene-jira-archive] mocobeta commented on issue #29: Can/should we make Jira read-only on migration to GitHub issues?



mocobeta commented on issue #29:
URL: 
https://github.com/apache/lucene-jira-archive/issues/29#issuecomment-1194132316

   I would open two issues at the same time when we ask infra to start the 
migration; one for running the import script, and one for making Jira 
read-only. Anyway, it'll need some time to explain our plan and perform the 
migration.



[GitHub] [lucene] iverase commented on a diff in pull request #1017: LUCENE-10654: Add new ShapeDocValuesField for LatLonShape and XYShape



iverase commented on code in PR #1017:
URL: https://github.com/apache/lucene/pull/1017#discussion_r928955670


##
lucene/core/src/java/org/apache/lucene/document/ShapeDocValues.java:
##
@@ -0,0 +1,907 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.document;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Comparator;
+import java.util.List;
+import org.apache.lucene.document.ShapeField.DecodedTriangle.TYPE;
+import org.apache.lucene.document.SpatialQuery.EncodedRectangle;
+import org.apache.lucene.geo.Component2D;
+import org.apache.lucene.index.PointValues.Relation;
+import org.apache.lucene.search.Query;
+import org.apache.lucene.store.ByteArrayDataInput;
+import org.apache.lucene.store.ByteBuffersDataOutput;
+import org.apache.lucene.store.DataInput;
+import org.apache.lucene.util.ArrayUtil;
+import org.apache.lucene.util.BytesRef;
+
+/**
+ * A binary doc values format representation for {@link LatLonShape} and {@link XYShape}
+ *
+ * Note that this class cannot be instantiated directly due to different encodings {@link
+ * org.apache.lucene.geo.XYEncodingUtils} and {@link org.apache.lucene.geo.GeoEncodingUtils}
+ *
+ * Concrete Implementations include: {@link LatLonShapeDocValues} and {@link XYShapeDocValues}
+ *
+ * @lucene.experimental
+ */
+abstract class ShapeDocValues {
+  /** doc value format version; used to support bwc for any encoding changes */
+  protected static final byte VERSION = 0;
+  /** the binary doc value */
+  private final BytesRef data;
+  /** the geometry comparator used to check relations */
+  protected final ShapeComparator shapeComparator;
+
+  /**
+   * Creates a {@ShapeDocValues} instance from a shape tessellation
+   *
+   * @param tessellation The tessellation (must not be null)
+   */
+  ShapeDocValues(List tessellation) {
+    this.data = computeBinaryValue(tessellation);
+    try {
+      this.shapeComparator = new ShapeComparator(this.data);
+    } catch (IOException e) {
+      throw new IllegalArgumentException("unable to read binary shape doc value field. ", e);
+    }
+  }
+
+  /** Creates a {@code ShapeDocValues} instance from a given serialized value */
+  ShapeDocValues(BytesRef binaryValue) {
+    this.data = binaryValue;
+    try {
+      this.shapeComparator = new ShapeComparator(this.data);
+    } catch (IOException e) {
+      throw new IllegalArgumentException("unable to read binary shape doc value field. ", e);
+    }
+  }
+
+  /** returns the encoded doc values field as a {@link BytesRef} */
+  protected BytesRef binaryValue() {
+    return this.data;
+  }
+
+  /** Returns the number of terms (tessellated triangles) for this shape */
+  public int numberOfTerms() {
+    return shapeComparator.numberOfTerms();
+  }
+
+  /** returns the min x value for the shape's bounding box */
+  public int getMinX() {
+    return shapeComparator.getMinX();
+  }
+
+  /** returns the min y value for the shape's bounding box */
+  public int getMinY() {
+    return shapeComparator.getMinY();
+  }
+
+  /** returns the max x value for the shape's bounding box */
+  public int getMaxX() {
+    return shapeComparator.getMaxX();
+  }
+
+  /** returns the max y value for the shape's bounding box */
+  public int getMaxY() {
+    return shapeComparator.getMaxY();
+  }
+
+  /** Retrieves the x centroid location for the geometry(s) */
+  public int getCentroidX() {
+    return shapeComparator.getCentroidX();
+  }
+
+  /** Retrieves the y centroid location for the geometry(s) */
+  public int getCentroidY() {
+    return shapeComparator.getCentroidY();
+  }
+
+  /**
+   * Retrieves the highest dimensional type (POINT, LINE, TRIANGLE) for computing the geometry(s)
+   * centroid
+   */
+  public TYPE getHighestDimension() {
+    return shapeComparator.getHighestDimension();
+  }
+
+  private BytesRef computeBinaryValue(List tessellation) {
+    try {
+      // dfs order serialization
+      List dfsSerialized = new ArrayList<>(tessellation.size());
+      buildTree(tessellation, dfsSerialized);
+      Writer w = new Writer(dfsSerialized);
+      return w.getBytesRef();
+    } catch (IOException e) {
+      throw new RuntimeException("Intern

[GitHub] [lucene] iverase commented on pull request #1017: LUCENE-10654: Add new ShapeDocValuesField for LatLonShape and XYShape

iverase commented on PR #1017:
URL: https://github.com/apache/lucene/pull/1017#issuecomment-1194123352

Hey Nick,

Why so much hurry with this change? I appreciate everything you are trying
to do to signal the feature as experimental and adding the version on the doc
value but still it will be nice to have something production ready from the
beginning and this way of developing this exciting and complex feature is
making things harder.

I am very much interested in this change because Elasticsearch has its own
doc value implementation. I would like to eventually move to Lucene doc values
so I want to make sure that the functionalities, the Elasticsearch
implementation currently has, can be preserved when / if it makes sense. In
order to achieve that we would like to donate our code or help with our
experience in this feature as we have been running it in production for 2+
years with success.

The Elasticsearch implementation is very similar to the one you are
proposing. It has three parts: first we add all the information regarding the
extent of the geometry, then we add centroid information, and finally we have
what I call the “triangle tree”, which is exactly what you described in this
issue. Here are the differences and how we would like to help add them:

Our geometry extent contains more information than just plain min/max values
of the coordinates that you are proposing. In particular, we capture the
minimum positive value and the maximum negative value for the x coordinate so
we can wrap those bounding boxes around the dateline in the geo case. This
would be my proposal for an Extent object that captures all that information:
https://github.com/iverase/lucene/blob/TriangleTree/lucene/core/src/java/org/apache/lucene/geo/Extent.java
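
The extent idea above can be sketched roughly as follows. This is an illustration only, not the linked `Extent.java`: the field names, the use of degrees in [-180, 180], and the rule of picking the narrower of the plain and the dateline-wrapping box are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class Extent:
    min_x: float
    max_x: float
    min_pos_x: Optional[float]  # smallest x >= 0, if any
    max_neg_x: Optional[float]  # largest x < 0, if any


def compute_extent(xs: Iterable[float]) -> Extent:
    """Track plain min/max plus the minimum positive and maximum negative x."""
    xs = list(xs)
    pos = [x for x in xs if x >= 0]
    neg = [x for x in xs if x < 0]
    return Extent(
        min(xs),
        max(xs),
        min(pos) if pos else None,
        max(neg) if neg else None,
    )


def longitude_range(e: Extent) -> Tuple[float, float, bool]:
    """Return (west, east, crosses_dateline), choosing the narrower box."""
    plain_width = e.max_x - e.min_x
    if e.min_pos_x is not None and e.max_neg_x is not None:
        # Width of the box that runs east from min_pos_x, over the
        # dateline at +/-180, to max_neg_x.
        wrapped_width = (180.0 - e.min_pos_x) + (e.max_neg_x + 180.0)
        if wrapped_width < plain_width:
            return e.min_pos_x, e.max_neg_x, True
    return e.min_x, e.max_x, False
```

For points at longitudes 170, 175, -170 and -178, plain min/max would give a box 353 degrees wide, while the two extra values allow a 20 degree box that wraps the dateline.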
In the case of the centroid, the biggest difference is that we are computing
the centroid using the original geometry. I really like the algorithm you are
proposing but you are working on the encoded space and I am wondering now if
that would work for cartesian, remember that the encoding is not linear so in
that case the centroids might be incorrect.
I would like to discuss the benefits of the way you are encoding those
values with a bit more care too.
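
The non-linearity concern can be shown with a small sketch. It assumes the cartesian encoding behaves like a float-to-sortable-int mapping over the IEEE 754 bits; that is an assumption made for this example, and the helper names below are hypothetical Python stand-ins rather than Lucene APIs.

```python
import struct


def float_to_sortable_int(x: float) -> int:
    """Map a 32-bit float to an int32 whose ordering matches the float ordering."""
    bits = struct.unpack(">i", struct.pack(">f", x))[0]
    return bits ^ ((bits >> 31) & 0x7FFFFFFF)


def sortable_int_to_float(v: int) -> float:
    """Inverse of float_to_sortable_int (the transform is its own inverse)."""
    bits = v ^ ((v >> 31) & 0x7FFFFFFF)
    return struct.unpack(">f", struct.pack(">i", bits))[0]


def mean_in_encoded_space(a: float, b: float) -> float:
    """Average two values after encoding, then decode the result."""
    encoded_mean = (float_to_sortable_int(a) + float_to_sortable_int(b)) // 2
    return sortable_int_to_float(encoded_mean)
```

Under that assumption, averaging 1.0 and 100.0 in the encoded space decodes to 10.25, while the mean in the original space is 50.5, so a centroid computed on encoded values can land far from the true one.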
Finally the triangle tree is the same idea, it is just the serialisation of
an interval tree that is composed of tessellation elements. One thing I
realised while developing this feature is that it is necessary to be able to
visit the tree in different ways and therefore adding a visitor pattern for the
tree is a big win. Here is our current implementation which can be used as a
good starting point:
https://github.com/iverase/lucene/blob/TriangleTree/lucene/core/src/java/org/apache/lucene/geo/TriangleTreeReader.java

Finally, I would like to propose to initially focus on the data structure
and once we are happy, we can start integrating the functionality, e.g support
for queries and so on. I hope a more structured approach will make sure we get
the right structure.

What do you think?


[GitHub] [lucene-jira-archive] mocobeta commented on a diff in pull request #80: #79: include parent issue link




mocobeta commented on code in PR #80:
URL: https://github.com/apache/lucene-jira-archive/pull/80#discussion_r928947072


##
migration/src/jira_util.py:
##
@@ -83,6 +83,15 @@ def extract_assignee(o: dict) -> tuple[str, str]:
     return (name, disp_name)
 
 
+def extract_parent(o: dict) -> tuple[str, str]:
+    parent = o["fields"].get("parent")
+    if parent:
+        key = parent["key"]
+        if key:
+            return key, f'https://issues.apache.org/jira/browse/{key}'

Review Comment:
   I would just extract the key and make the url on-the-fly at 
`jir2github_import.py` (L120).
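
   A minimal sketch of what that suggestion could look like, assuming the same issue dict shape as in the diff. The names `extract_parent_key` and `jira_issue_url` are hypothetical, and where the URL would actually be built in the import script is not shown in this thread.

```python
from typing import Optional

# Same base URL as in the diff above.
JIRA_BROWSE_URL = "https://issues.apache.org/jira/browse/"


def extract_parent_key(o: dict) -> Optional[str]:
    """Return only the parent issue key, or None if there is no parent."""
    parent = o["fields"].get("parent")
    if parent:
        return parent.get("key") or None
    return None


def jira_issue_url(key: str) -> str:
    """Build the Jira URL at the point of use instead of during extraction."""
    return f"{JIRA_BROWSE_URL}{key}"
```

   Keeping the extractor limited to the key leaves URL formatting in one place, so a change to the link format would not touch the extraction code.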




[jira] [Commented] (LUCENE-10592) Should we build HNSW graph on the fly during indexing



[ 
https://issues.apache.org/jira/browse/LUCENE-10592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17570929#comment-17570929
 ] 

ASF subversion and git services commented on LUCENE-10592:
--

Commit b15bcd11c333a96c043a3cc1e3498b8b09e7d6a2 in lucene's branch 
refs/heads/branch_9x from Mayya Sharipova
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=b15bcd11c33 ]

LUCENE-10592 Strengthen 
TestHnswGraph::testSortedAndUnsortedIndicesReturnSameResults

This test occasionally fails if knn search returns only 1 document
in the index, as we have an assertion that returned doc IDs from
sorted and unsorted index must be different.

This patch ensures that we have many documents in the index, so
that knn search always returns enough results.


> Should we build HNSW graph on the fly during indexing
> -
>
> Key: LUCENE-10592
> URL: https://issues.apache.org/jira/browse/LUCENE-10592
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Mayya Sharipova
>Assignee: Mayya Sharipova
>Priority: Minor
> Fix For: 9.4
>
>  Time Spent: 8h
>  Remaining Estimate: 0h
>
> Currently, when we index vectors for KnnVectorField, we buffer those vectors 
> in memory and, on flush during segment construction, we build an HNSW graph.  
> As building an HNSW graph is very expensive, this makes the flush operation take 
> a lot of time. This also makes overall indexing performance quite 
> unpredictable (as the number of flushes is determined by memory used and the 
> presence of concurrent searches), e.g. some indexing operations return almost 
> instantly while others that trigger a flush take a lot of time. 
> Building an HNSW graph on the fly as we index vectors allows us to avoid this 
> problem and spread the load of HNSW graph construction evenly during indexing.
> This will also supersede LUCENE-10194.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [lucene-jira-archive] mikemccand commented on issue #60: Invalid unicode character in conversion of comment



mikemccand commented on issue #60:
URL: 
https://github.com/apache/lucene-jira-archive/issues/60#issuecomment-1194092727

   Reminder: once we have the draft migrated test repo public, find some Jira 
issues that have exotic Unicode escapes/characters and confirm that they migrated 
properly.



[GitHub] [lucene-jira-archive] mikemccand opened a new pull request, #80: #79: include parent issue link



mikemccand opened a new pull request, #80:
URL: https://github.com/apache/lucene-jira-archive/pull/80

   Render the parent link in the `Legacy Jira` section.
   
   I also removed ` details` from `Legacy Jira details` header.  It seemed 
redundant / self-explanatory already.



[GitHub] [lucene-jira-archive] mikemccand merged pull request #76: 61: map Jira priority to legacy-jira-priority, and include votes in the 'Legacy Jira Information' header when it's > 0



mikemccand merged PR #76:
URL: https://github.com/apache/lucene-jira-archive/pull/76



[GitHub] [lucene-jira-archive] mikemccand commented on issue #79: Carry parent issue over

2022-07-25 Thread ASF subversion and git services (Jira)

mikemccand commented on issue #79:
URL:
https://github.com/apache/lucene-jira-archive/issues/79#issuecomment-1194082464

OK I have a PR; it renders LUCENE-618 opening description like this:

The last GData - Server commit does not build due to a wrong commit.
Yonik did not commit all the files in the diff file. There are several
sources and packages missing.

The diff - file with the date of 26.06.06 should be applied.
--> http://issues.apache.org/jira/browse/LUCENE-598
26.06.06.diff (644 kb)

could any of the lucene committers apply this patch. Yonik is on the way to
Dublin.

Thanks Simon

---
### Legacy Jira

[LUCENE-618](https://issues.apache.org/jira/browse/LUCENE-618) by Simon
Willnauer (@s1monw) on Jun 27 2006, resolved Jun 28 2006
Parent: [LUCENE-598](https://issues.apache.org/jira/browse/LUCENE-598)
Attachments:
[27.06.06.diff](https://raw.githubusercontent.com/apache/lucene-jira-archive/attachments/attachments/LUCENE-618/27.06.06.diff)


[jira] [Commented] (LUCENE-10592) Should we build HNSW graph on the fly during indexing



[ 
https://issues.apache.org/jira/browse/LUCENE-10592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17570912#comment-17570912
 ] 

ASF subversion and git services commented on LUCENE-10592:
--

Commit 2efc204a390044b67bcfb85683d82a9ea2f852a2 in lucene's branch 
refs/heads/main from Mayya Sharipova
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=2efc204a390 ]

LUCENE-10592 Strengthen 
TestHnswGraph::testSortedAndUnsortedIndicesReturnSameResults

This test occasionally fails if knn search returns only 1 document
in the index, as we have an assertion that returned doc IDs from
sorted and unsorted index must be different.

This patch ensures that we have many documents in the index, so
that knn search always returns enough results.






[GitHub] [lucene-jira-archive] mocobeta commented on issue #79: Carry parent issue over



mocobeta commented on issue #79:
URL: 
https://github.com/apache/lucene-jira-archive/issues/79#issuecomment-1194074037

   GitHub automatically adds mutual links if we link from one issue to another 
issue, but I agree that it'd be good to explicitly mention parent issues.



[GitHub] [lucene-jira-archive] mikemccand opened a new issue, #79: Carry parent issue over



mikemccand opened a new issue, #79:
URL: https://github.com/apache/lucene-jira-archive/issues/79

   Spinoff from #61.
   
   We already carry the other direction (`sub-tasks`).
   
   It looks like 333 issues have a parent.



[GitHub] [lucene-jira-archive] mikemccand commented on issue #61: Should we carry over Jira "labels"?



mikemccand commented on issue #61:
URL: 
https://github.com/apache/lucene-jira-archive/issues/61#issuecomment-1194063267

   OK I looked at all of the fields (listed above) and I think we are done, 
except for this one!:
   
   > I think we really should migrate `parent` in some way, since we already 
migrate the reverse direction (sub-tasks)?
   
   I'll open a separate issue for this and work on a PR.  I think it shouldn't 
be hard.  I plan to just append to the `Legacy Jira Information` header.
   



[GitHub] [lucene-jira-archive] mikemccand closed issue #61: Should we carry over Jira "labels"?



mikemccand closed issue #61: Should we carry over Jira "labels"?
URL: https://github.com/apache/lucene-jira-archive/issues/61



[GitHub] [lucene-jira-archive] mikemccand commented on issue #61: Should we carry over Jira "labels"?



mikemccand commented on issue #61:
URL: 
https://github.com/apache/lucene-jira-archive/issues/61#issuecomment-1194061412

   > > Ahh OK, hmm. But we cannot carry over these watches on behalf of all 
users, I assume. Users will have to re-watch the issues they care about (even 
the legacy Jira ones) again?
   > 
   > Yes. We will add comments to each Jira issue to let all watchers know the 
corresponding GitHub issue URL at the last step in the migration. The watchers 
should notice it.
   
   +1.



[GitHub] [lucene-jira-archive] mocobeta commented on issue #61: Should we carry over Jira "labels"?



mocobeta commented on issue #61:
URL: 
https://github.com/apache/lucene-jira-archive/issues/61#issuecomment-1194043812

   > Ahh OK, hmm. But we cannot carry over these watches on behalf of all 
users, I assume. Users will have to re-watch the issues they care about (even 
the legacy Jira ones) again? 
   
   Yes. We will add comments to each Jira issue to let all watchers know the 
corresponding GitHub issue URL at the last step in the migration. The watchers 
should notice it.



[GitHub] [lucene-jira-archive] apeteri commented on pull request #77: Update account-map.csv.20220722.verified



apeteri commented on PR #77:
URL: 
https://github.com/apache/lucene-jira-archive/pull/77#issuecomment-1194036688

   No worries! Thank you and everyone involved for handling the migration!



[GitHub] [lucene-jira-archive] mikemccand opened a new issue, #78: Draft a summary of how we migrated to GitHub issues



mikemccand opened a new issue, #78:
URL: https://github.com/apache/lucene-jira-archive/issues/78

   Spinoff from #61.
   
   We should try to briefly summarize what we did during the migration, fields 
we chose to leave out, limitations (e.g. if you were not "verified" then you 
are not "mentioned"), etc.  And maybe pointers to the tooling so other projects 
can maybe re-use and improve on this impressive start (thanks to @mocobeta!).  
This would be a nice artifact to leave for future developers / users / Googlers.



[GitHub] [lucene-jira-archive] mocobeta merged pull request #77: Update account-map.csv.20220722.verified



mocobeta merged PR #77:
URL: https://github.com/apache/lucene-jira-archive/pull/77



[GitHub] [lucene-jira-archive] mocobeta commented on pull request #77: Update account-map.csv.20220722.verified



mocobeta commented on PR #77:
URL: 
https://github.com/apache/lucene-jira-archive/pull/77#issuecomment-1194030016

   Thanks @apeteri for verifying it. We couldn't manually check all candidate 
accounts.



[GitHub] [lucene-jira-archive] mikemccand commented on issue #61: Should we carry over Jira "labels"?



mikemccand commented on issue #61:
URL: 
https://github.com/apache/lucene-jira-archive/issues/61#issuecomment-1194029781

   > > You can watch at the repo level, but not on individual issues?
   > 
   > You can watch/unwatch particular issues/PRs by subscribing to or unsubscribing 
from the issue/PR (a button with a bell icon is placed in the right panel for that).
   
   Ahh OK, hmm.  But we cannot carry over these watches on behalf of all users, 
I assume.  Users will have to re-watch the issues they care about (even the 
legacy Jira ones) again?  We should note this on the "Release Notes" that we 
send about this migration?  We should advertise the things we chose not to 
migrate.  Hmm, do we have a draft of these release notes?  I'll open a separate 
issue.



[GitHub] [lucene-jira-archive] mikemccand commented on issue #61: Should we carry over Jira "labels"?



mikemccand commented on issue #61:
URL: 
https://github.com/apache/lucene-jira-archive/issues/61#issuecomment-1194027968

   I think we really should migrate `parent` in some way, since we already 
migrate the reverse direction (sub-tasks)?



[GitHub] [lucene-jira-archive] mocobeta commented on issue #61: Should we carry over Jira "labels"?



mocobeta commented on issue #61:
URL: 
https://github.com/apache/lucene-jira-archive/issues/61#issuecomment-1194025672

   > You can watch at the repo level, but not on individual issues?
   
   You can watch/unwatch particular issues/PRs by subscribing to or unsubscribing 
from the issue/PR (a button with a bell icon is placed in the right panel for that).



[GitHub] [lucene-jira-archive] apeteri opened a new pull request, #77: Update account-map.csv.20220722.verified



apeteri opened a new pull request, #77:
URL: https://github.com/apache/lucene-jira-archive/pull/77

   I saw my name in the account map candidates file at this location: 
https://github.com/apache/lucene-jira-archive/blob/1309c3ff9b7815d660484b123b30be1789b672f4/migration/mappings-data/account-map.csv.20220722.232738#L1519



[GitHub] [lucene-jira-archive] mocobeta commented on issue #61: Should we carry over Jira "labels"?



mocobeta commented on issue #61:
URL: 
https://github.com/apache/lucene-jira-archive/issues/61#issuecomment-1194021106

   > So maybe we leave `watches` behind (do not migrate that field to GitHub)?
   
   I agree with it. 



[GitHub] [lucene-jira-archive] mikemccand commented on issue #61: Should we carry over Jira "labels"?



mikemccand commented on issue #61:
URL: 
https://github.com/apache/lucene-jira-archive/issues/61#issuecomment-1194019029

   `watches` is a Jira field for people who have explicitly subscribed to 
updates on a Jira issue.
   
   I don't think GitHub has the same capability?  You can watch at the repo 
level, but not on individual issues?
   
   So maybe we leave `watches` behind (do not migrate that field to GitHub)?



[GitHub] [lucene-jira-archive] mocobeta commented on pull request #76: 61: map Jira priority to legacy-jira-priority, and include votes in the 'Legacy Jira Information' header when it's > 0



mocobeta commented on PR #76:
URL: 
https://github.com/apache/lucene-jira-archive/pull/76#issuecomment-1194018160

   Please feel free to merge this. I cannot run/test this right now.



[GitHub] [lucene-jira-archive] mikemccand commented on issue #61: Should we carry over Jira "labels"?



mikemccand commented on issue #61:
URL: 
https://github.com/apache/lucene-jira-archive/issues/61#issuecomment-1194005927

   Woops, I failed to link the PR properly: 
https://github.com/apache/lucene-jira-archive/pull/76



[GitHub] [lucene-jira-archive] mikemccand commented on issue #29: Can/should we make Jira read-only on migration to GitHub issues?



mikemccand commented on issue #29:
URL: 
https://github.com/apache/lucene-jira-archive/issues/29#issuecomment-1194003681

   Maybe we just open an INFRA issue now, but make it clear not to actually do 
it yet, to get it on their radar?



[GitHub] [lucene-jira-archive] mikemccand commented on issue #29: Can/should we make Jira read-only on migration to GitHub issues?



mikemccand commented on issue #29:
URL: 
https://github.com/apache/lucene-jira-archive/issues/29#issuecomment-1194003309

   Have we confirmed that INFRA is able to make the whole Jira project 
read-only after we are done migrating?  Is it just a matter of opening a ticket 
and it's quick/easy?  From my (limited) online digging I am not so sure it is 
easy ;)



[GitHub] [lucene-jira-archive] mikemccand opened a new pull request, #76: 61: map Jira priority to legacy-jira-priority, and include votes in the 'Legacy Jira Information' header when it's > 0



mikemccand opened a new pull request, #76:
URL: https://github.com/apache/lucene-jira-archive/pull/76

   I ran on one issue that had votes and peeked at the output JSON and it looks 
correct!
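
A rough sketch of the two behaviours named in the PR title, assuming the usual Jira REST issue shape (`fields.priority.name` and `fields.votes.votes`). The function names, the `legacy-jira-priority:<name>` label format, and the `Votes: N` line are assumptions for illustration and may differ from what the PR actually emits:

```python
from typing import Optional


def priority_label(o: dict) -> Optional[str]:
    """Map the Jira priority name to a label (label format is an assumption)."""
    priority = o.get("fields", {}).get("priority") or {}
    name = priority.get("name")
    if not name:
        return None
    return f"legacy-jira-priority:{name.strip().lower()}"


def votes_line(o: dict) -> str:
    """Return a 'Votes: N' header line only when the vote count is > 0."""
    votes = (o.get("fields", {}).get("votes") or {}).get("votes") or 0
    return f"Votes: {votes}" if votes > 0 else ""
```

Returning an empty string for zero votes lets the caller drop the line from the header without a separate check.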



[GitHub] [lucene-jira-archive] mikemccand commented on issue #61: Should we carry over Jira "labels"?



mikemccand commented on issue #61:
URL: 
https://github.com/apache/lucene-jira-archive/issues/61#issuecomment-1193991508

   > > Hmm how about `legacy-jira-priority`?
   > 
   > I'm fine with it if it's needed.
   > 
   > > If we carried over `legacy-jira-votes` how would we search it? Could we 
sort by it?
   > 
   > I don't think issues can be sorted by labels (for now), or that it would be useful for 
filtering purposes. I would just log it in the issues' "Jira Information" 
section - if we want to convert it into labels in the future, we can add issue 
labels anytime by GitHub APIs.
   
   +1, OK I'll make a PR for both.



[GitHub] [lucene-jira-archive] mocobeta commented on a diff in pull request #75: Update account-map.csv.20220722.verified



mocobeta commented on code in PR #75:
URL: https://github.com/apache/lucene-jira-archive/pull/75#discussion_r928823975


##
migration/mappings-data/account-map.csv.20220722.verified:
##
@@ -169,3 +169,4 @@ mharwood,markharwood,Mark Harwood
 hossman,hossman,Chris M. Hostetter
 munendrasn,munendrasn,Munendra S N
 vajda,ovalhub,Andi Vajda
+manish1982,manishbafna,Manish

Review Comment:
   ```suggestion
   manish82,manishbafna,Manish
   ```
   I suggest this change - @manishbafna if you are fine with it I'll commit 
this in.




[GitHub] [lucene-jira-archive] mocobeta commented on a diff in pull request #75: Update account-map.csv.20220722.verified



mocobeta commented on code in PR #75:
URL: https://github.com/apache/lucene-jira-archive/pull/75#discussion_r928803845


##
migration/mappings-data/account-map.csv.20220722.verified:
##
@@ -169,3 +169,4 @@ mharwood,markharwood,Mark Harwood
 hossman,hossman,Chris M. Hostetter
 munendrasn,munendrasn,Munendra S N
 vajda,ovalhub,Andi Vajda
+manish1982,manishbafna,Manish

Review Comment:
   We don't see the username `manish1982` in Lucene Jira, I wonder if you mean 
`manish82`?
   ```
   manish82,Manish
   ```
   https://issues.apache.org/jira/secure/ViewProfile.jspa?name=manish82





[GitHub] [lucene-jira-archive] mocobeta commented on issue #61: Should we carry over Jira "labels"?



mocobeta commented on issue #61:
URL: 
https://github.com/apache/lucene-jira-archive/issues/61#issuecomment-1193953267

   > Hmm how about `legacy-jira-priority`?
   
   I'm fine with it if it's needed.
   
   > If we carried over `legacy-jira-votes` how would we search it? Could we 
sort by it?
   
   I don't think issues can be sorted by labels (for now), or that it would be useful for 
filtering purposes. I would just log it in the issues' "Jira Information" 
section - if we want to convert it into labels in the future, we can add issue 
labels anytime by GitHub APIs.
   



[GitHub] [lucene-jira-archive] mikemccand commented on issue #61: Should we carry over Jira "labels"?



mikemccand commented on issue #61:
URL: 
https://github.com/apache/lucene-jira-archive/issues/61#issuecomment-1193931342

   > I once considered/looked at it and decided not to carry them over to 
GitHub - I would do nothing for it if you are ok with that.
   
   Yeah +1.  They drop even below my signal/noise threshold ;)



[GitHub] [lucene-jira-archive] mikemccand commented on issue #61: Should we carry over Jira "labels"?



mikemccand commented on issue #61:
URL: 
https://github.com/apache/lucene-jira-archive/issues/61#issuecomment-1193930893

   > I didn't know that, but I noticed we (reporters? committers?) can set an 
arbitrary username to Reporter fields in Jira.
   
   LOL I did not know either!  Furthermore, you can change it after the fact!  
Maybe `creator` cannot be updated but `reporter` can?  Must be for an "on 
behalf of" sort of situation.



[GitHub] [lucene-jira-archive] mikemccand commented on issue #61: Should we carry over Jira "labels"?



mikemccand commented on issue #61:
URL: 
https://github.com/apache/lucene-jira-archive/issues/61#issuecomment-1193930028

   Does GitHub have a way to vote (+1) on issues?
   
   Also, do GitHub labels support numeric types?
   
   If we carried over `legacy-jira-votes` how would we search it?  Could we 
sort by it?
   
   Vote distribution:
   
   ```
   Votes:
   0: 9802
   1: 545
   2: 142
   3: 63
   4: 25
   5: 22
   6: 10
   8: 7
   7: 6
   12: 5
   11: 4
   9: 3
   14: 2
   10: 2
   13: 1
   22: 1
   19: 1
   28: 1
   16: 1
   36: 1
   15: 1
   ```



[GitHub] [lucene-jira-archive] manishbafna opened a new pull request, #75: Update account-map.csv.20220722.verified



manishbafna opened a new pull request, #75:
URL: https://github.com/apache/lucene-jira-archive/pull/75

   My account info added



[GitHub] [lucene-jira-archive] mikemccand commented on issue #61: Should we carry over Jira "labels"?



mikemccand commented on issue #61:
URL: 
https://github.com/apache/lucene-jira-archive/issues/61#issuecomment-1193923036

   Hmm how about `legacy-jira-priority`?
   
   ```
   python print_priority.py
   Major 6182
   Minor 3540
   Trivial 604
   Blocker 202
   Critical 117
   ```
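
The source of `print_priority.py` is not shown in this thread. Below is a minimal sketch of how such a tally could be produced, assuming each file under `jira-dump/` holds one issue as Jira REST JSON with the priority at `fields.priority.name`. The function names are hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def count_priorities(issues):
    """Tally issues by fields.priority.name; a missing priority is counted as 'None'."""
    counts = Counter()
    for issue in issues:
        priority = (issue.get("fields") or {}).get("priority") or {}
        counts[priority.get("name", "None")] += 1
    return counts


def load_issues(dump_dir):
    """Yield one parsed issue per *.json file in dump_dir (assumed dump layout)."""
    for path in sorted(Path(dump_dir).glob("*.json")):
        with path.open(encoding="utf-8") as f:
            yield json.load(f)


def format_counts(counts):
    """Render 'name count' lines, most frequent first, like the output above."""
    return "\n".join(f"{name} {n}" for name, n in counts.most_common())
```

Calling `print(format_counts(count_priorities(load_issues("jira-dump"))))` would print one line per priority. The same counting pattern should work for other single-valued fields if the lookup path is changed.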
   
   E.g. "how many times did an issue with merging block a release"?



[GitHub] [lucene-jira-archive] mocobeta commented on issue #1: Fix markup conversion error



mocobeta commented on issue #1:
URL: 
https://github.com/apache/lucene-jira-archive/issues/1#issuecomment-1193898346

   > Rather than having everyone make a PR / commit a change, to lower the 
barrier, maybe just allow them to reply to the email? I volunteer to go through 
all replies and carry them over to the mapping file.
   
   Thanks for your suggestion, I sent an e-mail to the dev list.



[GitHub] [lucene-jira-archive] mocobeta commented on issue #61: Should we carry over Jira "labels"?



mocobeta commented on issue #61:
URL: 
https://github.com/apache/lucene-jira-archive/issues/61#issuecomment-1193881378

   > > I think "Lucene Fields" has only two values - "New" and "Patch 
Available".
   > 
   > OK let's maybe not carry those over?
   
   I once considered/looked at it and decided not to carry them over to GitHub 
- I would do nothing for it if you are ok with that.



[jira] [Updated] (LUCENE-10557) Migrate to GitHub issue from Jira



 [ 
https://issues.apache.org/jira/browse/LUCENE-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomoko Uchida updated LUCENE-10557:
---
Reporter: Tomoko Uchida  (was: Michael McCandless)

> Migrate to GitHub issue from Jira
> -
>
> Key: LUCENE-10557
> URL: https://issues.apache.org/jira/browse/LUCENE-10557
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Tomoko Uchida
>Assignee: Tomoko Uchida
>Priority: Major
> Attachments: Screen Shot 2022-06-29 at 11.02.35 AM.png, 
> image-2022-06-29-13-36-57-365.png, screenshot-1.png
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> A few (not the majority) Apache projects already use the GitHub issue instead 
> of Jira. For example,
> Airflow: [https://github.com/apache/airflow/issues]
> BookKeeper: [https://github.com/apache/bookkeeper/issues]
> So I think it'd be technically possible that we move to GitHub issue. I have 
> little knowledge of how to proceed with it, I'd like to discuss whether we 
> should migrate to it, and if so, how to smoothly handle the migration.
> The major tasks would be:
>  * (/) Get a consensus about the migration among committers
>  * (/) Choose issues that should be moved to GitHub - We'll migrate all 
> issues towards an atomic switch to GitHub if no major technical obstacles 
> show up.
>  ** Discussion thread 
> [https://lists.apache.org/thread/1p3p90k5c0d4othd2ct7nj14bkrxkr12]
>  ** -Conclusion for now: We don't migrate any issues. Only new issues should 
> be opened on GitHub.-
>  ** Write a prototype migration script - the decision could be made on that. 
> Things to consider:
>  *** version numbers - labels or milestones?
>  *** add a comment/ prepend a link to the source Jira issue on github side,
>  *** add a comment/ prepend a link on the jira side to the new issue on 
> github side (for people who access jira from blogs, mailing list archives and 
> other sources that will have stale links),
>  *** convert cross-issue automatic links in comments/ descriptions (as 
> suggested by Robert),
>  *** strategy to deal with sub-issues (hierarchies),
>  *** maybe prefix (or postfix) the issue title on github side with the 
> original LUCENE-XYZ key so that it is easier to search for a particular issue 
> there?
>  *** how to deal with user IDs (author, reporter, commenters)? Do they have 
> to be github users? Will information about people not registered on github be 
> lost?
>  *** create an extra mapping file of old-issue-new-issue URLs for any 
> potential future uses.
>  *** what to do with issue numbers in git/svn commits? These could be 
> rewritten but it'd change the entire git history tree - I don't think this is 
> practical, while doable.
> * Prepare a complete migration tool
> ** See https://github.com/apache/lucene-jira-archive/issues/5 
> * Build the convention for issue label/milestone management
>  ** See [https://github.com/apache/lucene-jira-archive/issues/6]
>  ** Do some experiments on a sandbox repository 
> [https://github.com/mocobeta/sandbox-lucene-10557]
>  ** Make documentation for metadata (label/milestone) management 
>  * (/) Enable Github issue on the lucene's repository
>  ** Raise an issue on INFRA
>  ** (Create an issue-only private repository for sensitive issues if it's 
> needed and allowed)
>  ** Set a mail hook to 
> [issues@lucene.apache.org|mailto:issues@lucene.apache.org] (many thanks to 
> the general mail group name)
>  * Set a schedule for migration
>  ** See [https://github.com/apache/lucene-jira-archive/issues/7]
>  ** Give some time to committers to play around with issues/labels/milestones 
> before the actual migration
>  ** Make an announcement on the mail lists
>  ** Show some text messages when opening a new Jira issue (in issue template?)
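
The items about converting cross-issue links and keeping an old-to-new mapping file could be sketched roughly as follows. This is an illustration only, not the project's migration script: it assumes a dictionary from Jira keys to GitHub issue numbers is already available, and the function name and sample mapping are hypothetical.

```python
import re

# Matches Jira keys such as LUCENE-10557 as whole words.
JIRA_KEY = re.compile(r"\bLUCENE-\d+\b")


def remap_issue_keys(text, jira_to_github):
    """Replace each mapped LUCENE-NNN key with '#<GitHub number>'; leave others as-is."""
    def substitute(match):
        number = jira_to_github.get(match.group(0))
        return match.group(0) if number is None else f"#{number}"
    return JIRA_KEY.sub(substitute, text)


# Hypothetical mapping and comment text.
remapped = remap_issue_keys("See LUCENE-10557 and LUCENE-1.", {"LUCENE-10557": 42})
```

A real conversion would also need to avoid rewriting keys that appear inside URLs or code blocks, which this sketch does not attempt.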



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene-jira-archive] mocobeta commented on issue #61: Should we carry over Jira "labels"?



mocobeta commented on issue #61:
URL: 
https://github.com/apache/lucene-jira-archive/issues/61#issuecomment-1193872039

   > I wonder what is the difference between creator and reporter? Oh I see, I 
(mikemccand) can create an issue but list another user as reporter. Curious ;) 
I wonder how often that has happened.
   
   I didn't know that, but I noticed we (reporters? committers?) can set an 
arbitrary username in the `Reporter` field in Jira.
   ![Screenshot from 2022-07-25 19-25-11](https://user-images.githubusercontent.com/1825333/180756240-eac5bc0b-c181-437d-b976-9831950362dd.png)
   
   In that case, `reporter` and `creator` are set to different values.
   For example, I changed the Reporter field in 
https://issues.apache.org/jira/browse/LUCENE-10557 as follows.
   ![Screenshot from 2022-07-25 19-29-37](https://user-images.githubusercontent.com/1825333/180756539-47db365b-ef98-434f-8018-ace90fe0e56f.png)
   
   Now, `creator` and `reporter` are different.
   ```
   (.venv) migration $ cat jira-dump/LUCENE-10557.json | jq '.fields.creator.name'
   "tomoko"
   (.venv) migration $ cat jira-dump/LUCENE-10557.json | jq '.fields.reporter.name'
   "mikemccand"
   ```
   



[GitHub] [lucene-jira-archive] mikemccand commented on issue #61: Should we carry over Jira "labels"?



mikemccand commented on issue #61:
URL: 
https://github.com/apache/lucene-jira-archive/issues/61#issuecomment-1193871524

   > I think "Lucene Fields" has only two values - "New" and "Patch Available".
   
   OK let's maybe not carry those over?



[GitHub] [lucene-jira-archive] mikemccand commented on issue #1: Fix markup conversion error



mikemccand commented on issue #1:
URL: 
https://github.com/apache/lucene-jira-archive/issues/1#issuecomment-1193870585

   I suggest also sending a separate email asking for people to state their 
GitHub id / Jira id mapping, if they are comfortable doing so?
   
   Rather than having everyone make a PR / commit a change, to lower the 
barrier, maybe just allow them to reply to the email?  I volunteer to go 
through all replies and carry them over to the mapping file.
   
   I still wonder/wish we could use Apache's LDAP server behind `id.apache.org` 
-- it sometimes knows the GitHub id of committers.  Hmm but I'm not sure if it 
knows the Jira id?  TooManyIDsException!!



[jira] [Updated] (LUCENE-10557) Migrate to GitHub issue from Jira



 [ 
https://issues.apache.org/jira/browse/LUCENE-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomoko Uchida updated LUCENE-10557:
---
Reporter: Tomoko Uchida  (was: mike Ma)


[jira] [Updated] (LUCENE-10557) Migrate to GitHub issue from Jira



 [ 
https://issues.apache.org/jira/browse/LUCENE-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomoko Uchida updated LUCENE-10557:
---
Reporter: Michael McCandless  (was: Tomoko Uchida)


[jira] [Updated] (LUCENE-10557) Migrate to GitHub issue from Jira



 [ 
https://issues.apache.org/jira/browse/LUCENE-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomoko Uchida updated LUCENE-10557:
---
Reporter: mike Ma  (was: Tomoko Uchida)


[GitHub] [lucene-jira-archive] mikemccand commented on issue #1: Fix markup conversion error



mikemccand commented on issue #1:
URL: 
https://github.com/apache/lucene-jira-archive/issues/1#issuecomment-1193860507

   Thanks @mocobeta -- the conversions are looking great -- I spot checked 
around a dozen issues yesterday.
   
   Once you have the new full migration done, I suggest sending an email to the 
dev list to call everyone's attention to it, and set a time box (two or three 
days?).



[GitHub] [lucene-jira-archive] mocobeta commented on issue #1: Fix markup conversion error



mocobeta commented on issue #1:
URL: 
https://github.com/apache/lucene-jira-archive/issues/1#issuecomment-1193852170

   > With a few further small fixes, I'll run a full migration next week once 
again. It will hopefully be the final iteration, I'll make it publicly 
available to let others check/investigate the result.
   
   I started a (hopefully final) rehearsal to walk through all the steps 
described in #7 with accumulated improvements in the migration scripts. Once it 
is finished, I'll share the test repository so the migration result can be 
checked manually. I haven't come up with a systematic methodology for that, 
though; we could randomly pick issues with complex markup, attachments, or 
various hyperlinks and then compare them to the original Jira issues.



[GitHub] [lucene-jira-archive] mikemccand commented on issue #61: Should we carry over Jira "labels"?



mikemccand commented on issue #61:
URL: 
https://github.com/apache/lucene-jira-archive/issues/61#issuecomment-1193843373

   Heh, 13 times too:
   
   ```
   issue LUCENE-288, fields["reporter"]["name"]='vajda', fields["creator"]["name"]='a...@osafoundation.org'
   issue LUCENE-7726, fields["reporter"]["name"]='hossman', fields["creator"]["name"]='uschindler'
   issue LUCENE-4864, fields["reporter"]["name"]='mpoindexter', fields["creator"]["name"]='uschindler'
   issue LUCENE-5169, fields["reporter"]["name"]='jpountz', fields["creator"]["name"]='watuki'
   issue LUCENE-5056, fields["reporter"]["name"]='hdeadman', fields["creator"]["name"]='dsmiley'
   issue LUCENE-672, fields["reporter"]["name"]='ysee...@gmail.com', fields["creator"]["name"]='ningli'
   issue LUCENE-1264, fields["reporter"]["name"]='bmargulies', fields["creator"]["name"]='bimargulies'
   issue LUCENE-346, fields["reporter"]["name"]='cutting', fields["creator"]["name"]='cutt...@apache.org'
   issue LUCENE-6673, fields["reporter"]["name"]='dancollins', fields["creator"]["name"]='andyetitmoves'
   issue LUCENE-228, fields["reporter"]["name"]='vajda', fields["creator"]["name"]='a...@osafoundation.org'
   issue LUCENE-289, fields["reporter"]["name"]='vajda', fields["creator"]["name"]='a...@osafoundation.org'
   issue LUCENE-359, fields["reporter"]["name"]='cutting', fields["creator"]["name"]='cutt...@apache.org'
   issue LUCENE-1, fields["reporter"]["name"]='cutting', fields["creator"]["name"]='cutt...@apache.org'
   ```
   
   Looks like we use `reporter` not `creator`, perfect.
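
A scan of this kind could look roughly like the sketch below. It assumes each issue is a dictionary shaped like the Jira dump JSON, with `key`, `fields.reporter.name` and `fields.creator.name`. The function name and the sample issues are hypothetical, and this is not the script that produced the listing above.

```python
def reporter_creator_mismatches(issues):
    """Return (key, reporter, creator) for issues whose reporter and creator names differ.

    A missing reporter or creator is treated as None, so an issue with only
    one of the two set is also reported.
    """
    mismatches = []
    for issue in issues:
        fields = issue.get("fields") or {}
        reporter = (fields.get("reporter") or {}).get("name")
        creator = (fields.get("creator") or {}).get("name")
        if reporter != creator:
            mismatches.append((issue.get("key"), reporter, creator))
    return mismatches


# Hypothetical in-memory sample; real input would be loaded from the dump files.
sample_issues = [
    {"key": "LUCENE-7726",
     "fields": {"reporter": {"name": "hossman"}, "creator": {"name": "uschindler"}}},
    {"key": "LUCENE-0000",
     "fields": {"reporter": {"name": "same"}, "creator": {"name": "same"}}},
]
found = reporter_creator_mismatches(sample_issues)
```

Here `found` holds a single tuple for `LUCENE-7726`, since the second sample issue has identical reporter and creator.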



[GitHub] [lucene-jira-archive] mikemccand commented on issue #61: Should we carry over Jira "labels"?



mikemccand commented on issue #61:
URL: 
https://github.com/apache/lucene-jira-archive/issues/61#issuecomment-1193837603

   I wonder what is the difference between `creator` and `reporter`?  Oh I see, 
I (`mikemccand`) can create an issue but list another user as `reporter`.  
Curious ;)  I wonder how often that has happened.



[GitHub] [lucene-jira-archive] mikemccand commented on pull request #74: Polish a few sharp edges that hit me when running remap_cross_issue_links.py