[jira] [Commented] (LUCENE-7453) Change naming of variables/apis from docid to docnum

2016-10-04 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15544807#comment-15544807
 ] 

Adrien Grand commented on LUCENE-7453:
--

Sorry, I made a typo in the commit message: this commit is related to 
LUCENE-7463, not LUCENE-7453.

> Change naming of variables/apis from docid to docnum
> 
>
> Key: LUCENE-7453
> URL: https://issues.apache.org/jira/browse/LUCENE-7453
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Ryan Ernst
>
> In SOLR-9528 a suggestion was made to change {{docid}} to {{docnum}}. The 
> reasoning for this is most notably that {{docid}} has a connotation of a 
> persistent unique identifier (e.g. like {{_id}} in Elasticsearch or {{id}} in 
> Solr), while {{docid}} in Lucene is currently something local to a segment, and 
> not directly comparable across segments.
> When I first started working on Lucene, I had this same confusion. {{docnum}} 
> is a much better name for this transient, segment-local identifier for a doc. 
> Regardless of what Solr wants to do in its API (e.g. keeping _docid_), I 
> think we should switch the Lucene APIs and variable names to use docnum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7453) Change naming of variables/apis from docid to docnum

2016-10-04 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15544805#comment-15544805
 ] 

ASF subversion and git services commented on LUCENE-7453:
-

Commit 32446e9205679fb94b247f0fa2aa97ecd54a49ff in lucene-solr's branch 
refs/heads/master from [~jpountz]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=32446e9 ]

LUCENE-7453: Create a Lucene70Codec.





[jira] [Commented] (LUCENE-7453) Change naming of variables/apis from docid to docnum

2016-09-22 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15513380#comment-15513380
 ] 

Yonik Seeley commented on LUCENE-7453:
--

I don't think changing the name really helps a new user understand what a docid 
actually is, or the safe ways to use one - that's the much harder part: 
the fact that it's transient in a sense (but still cacheable for the lifetime 
of a reader), that it's local to a segment (one has to understand segments and 
the fact that they are mostly immutable), that you *can* reuse one on a 
different view of the same segment (deleted docs), etc.

This naming discussion would have been appropriate during the initial naming 
perhaps, but now a rename would inflict guaranteed pain on all existing devs / 
documentation / books / blogs, etc., all in an attempt to save a few *seconds* 
of new user confusion out of the *days/weeks* of total confusion 
necessary to build a mental model of how Lucene actually works.  In fact, it 
may be just as likely to cause confusion if the new user is using any 
out-of-date resources that use the old terminology.  It sounds like a poor 
trade-off to rename now.
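
The properties listed in this comment (segment-local, stable only while a segment is unchanged, renumbered by merges) can be illustrated with a toy model. This is only a sketch: {{Segment}}, {{doc_bases}} and {{merge}} are hypothetical names invented for illustration and are not Lucene APIs.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """Toy stand-in for an immutable segment: documents sit in a fixed array."""
    docs: list
    deleted: set = field(default_factory=set)  # segment-local ids marked deleted


def doc_bases(segments):
    """Starting offset of each segment in the reader-wide numbering."""
    bases, total = [], 0
    for seg in segments:
        bases.append(total)
        total += len(seg.docs)  # deleted docs still occupy a slot
    return bases


def merge(segments):
    """Merging copies live documents into a new segment, renumbering them."""
    live = [doc
            for seg in segments
            for local_id, doc in enumerate(seg.docs)
            if local_id not in seg.deleted]
    return Segment(live)


seg_a = Segment(["apple", "banana", "cherry"], deleted={1})
seg_b = Segment(["date", "elderberry"])

bases = doc_bases([seg_a, seg_b])
# "date" has local id 0 in seg_b; reader-wide it is shifted by seg_a's size
global_id_of_date = bases[1] + 0

# After a merge drops the deleted "banana", the same document gets a new number
merged = merge([seg_a, seg_b])
id_of_date_after_merge = merged.docs.index("date")
```

In this model a number is safe to cache only for as long as the segment list it came from is still the one being read, which is the part a rename alone does not teach.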





[jira] [Commented] (LUCENE-7453) Change naming of variables/apis from docid to docnum

2016-09-21 Thread Paul Elschot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15511145#comment-15511145
 ] 

Paul Elschot commented on LUCENE-7453:
--

bq. But the seg examples you have still have docid, just with seg prepended. It 
still has the problem that it uses "id", when id means identifier,

This is meant as an identifier for a document within a segment; in a segment 
this identifier is permanent, and the only one.

For compound readers there are multiple segments, and also in that case adding 
seg to the name is correct.





[jira] [Commented] (LUCENE-7453) Change naming of variables/apis from docid to docnum

2016-09-21 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15511085#comment-15511085
 ] 

Dawid Weiss commented on LUCENE-7453:
-

Hoss, consider the possibilities of using non-ASCII Unicode letters (valid Java 
identifiers). Perfectly fine to call it {{docλ}}. Or even better, just {{ℵ}}. 
{{ℵSet}}. {{ℵIterator}}. Think of it: wouldn't that be truly unique?




[jira] [Commented] (LUCENE-7453) Change naming of variables/apis from docid to docnum

2016-09-21 Thread Mikhail Khludnev (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15511065#comment-15511065
 ] 

Mikhail Khludnev commented on LUCENE-7453:
--

Whatever name you choose, let's have _dixi_ in the code! Please!




[jira] [Commented] (LUCENE-7453) Change naming of variables/apis from docid to docnum

2016-09-21 Thread Ryan Ernst (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15511049#comment-15511049
 ] 

Ryan Ernst commented on LUCENE-7453:


bq. this is probably a multicriteria optimisation problem with pareto-optimal 
set of solutions, rather than a single one...

That is why I'm ok with "docIndex" here, it just isn't my favorite. But it is 
still better than docid.




[jira] [Commented] (LUCENE-7453) Change naming of variables/apis from docid to docnum

2016-09-21 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15511047#comment-15511047
 ] 

Hoss Man commented on LUCENE-7453:
--

We could just call it the {{huperDuperEphemeralPositionInIndex}}




[jira] [Commented] (LUCENE-7453) Change naming of variables/apis from docid to docnum

2016-09-21 Thread Ryan Ernst (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15511042#comment-15511042
 ] 

Ryan Ernst commented on LUCENE-7453:


bq. I don't like overloading index for this, especially in the class names, so 
for now I'd prefer the segment variants in the second column.

But the seg examples you have still have docid, just with seg prepended. It 
still has the problem that it uses "id", when id means identifier, which is 
usually an opaque string. {{docnum}} to me is still the best: the document 
number in the segment (where "in the segment" is implied, although if we want 
SegDocNum, I guess it'd be ok, just more wordy).

bq. Anyway, we could use the opportunity to shorten some of the longer names.

+1




[jira] [Commented] (LUCENE-7453) Change naming of variables/apis from docid to docnum

2016-09-21 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15511028#comment-15511028
 ] 

Dawid Weiss commented on LUCENE-7453:
-

bq. [...] I meant the overloading of "index", because it is used in so many 
ways (inverted index and array index, which are superficially similar but very 
different things: one is a collection of data structures in files, and the 
other is a number).

That the term is overloaded with meanings doesn't mean this particular use case 
isn't appropriate. 

Looking at Paul's table the {{index}} column still looks semantically most 
clear to my eyes. What I like in particular is that there is no notion of the 
"index" belonging to a segment (as {{SegDocIdSet}} would imply). Rather, like I 
said before, the doc index is a logical index of a document within an index 
reader (whether it's a leaf reader or a composite doesn't matter). It nicely 
fits in loops created for {{reader.maxDoc}}, for example.
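
The loop shape mentioned here can be sketched as follows. {{ToyReader}} is a hypothetical stand-in written for this illustration, not the Lucene reader API; it only assumes that {{max_doc}} is one greater than the largest doc index and that deleted documents keep their slot.

```python
class ToyReader:
    """Hypothetical stand-in for an index reader (leaf or composite)."""

    def __init__(self, docs, deleted=()):
        self._docs = list(docs)
        self._deleted = set(deleted)

    @property
    def max_doc(self):
        # One greater than the largest doc index; deleted docs are included
        return len(self._docs)

    def is_live(self, doc_index):
        return doc_index not in self._deleted

    def document(self, doc_index):
        return self._docs[doc_index]


reader = ToyReader(["a", "b", "c", "d"], deleted=[2])

# The doc index is just a position in the reader's logical array of documents
live = [reader.document(i)
        for i in range(reader.max_doc)
        if reader.is_live(i)]
```

Read this way, "index" describes the loop variable accurately whether the reader covers one segment or several.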

This said, everyone will have their favorites; this is probably a multicriteria 
optimisation problem with pareto-optimal set of solutions, rather than a single 
one...





[jira] [Commented] (LUCENE-7453) Change naming of variables/apis from docid to docnum

2016-09-21 Thread Paul Elschot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15510965#comment-15510965
 ] 

Paul Elschot commented on LUCENE-7453:
--

I tried an alternative that adds a variation of segment wherever docID is used 
in some form.

Here is an overview of renaming possibilities for core/src/java, in 
three-column Python strings.

The first column contains the current name, the second column a segment 
variant, the third column an index variant.
Please assume an appropriate number of question marks (??) in the second and 
third columns.

{code}
classFileRenames = """

DocIdSet                      SegDocIdSet                      DocIndexSet
DocIdSetIterator              SegDocIdSetIterator              DocIndexIterator
ConjunctionDISI               ConjunctionSegDisi               ConjunctionDixi
DisjunctionDISIApproximation  DisjunctionSegDisiApproximation  DisjunctionDixiApproximation
DisiPriorityQueue             SegDisiPriorityQueue             DixiPriorityQueue
DisiWrapper                   SegDisiWrapper                   DixiWrapper
FilteredDocIdSetIterator      FilteredSegDisi                  FilteredDixi
DocIdSetBuilder               SegDocIdSetBuilder               DocIndexSetBuilder
RoaringDocIdSet               RoaringSegDocIdSet               RoaringDocIndexSet
IntArrayDocIdSet              IntArraySegDocIdSet              IntArrayDocIndexSet
NotDocIdSet                   NotSegDocIdSet                   NotDocIndexSet
BitDocIdSet                   BitSegDocIdSet                   BitDocIndexSet
DocIdsWriter                  SegDocIdsWriter                  DocIndexesWriter
DocIdMerger                   SegDocIdMerger                   DocIndexMerger
"""

identifierRenames = classFileRenames + """

TwoPhaseIteratorAsDocIdSetIterator  TwoPhaseIteratorAsSegDocIdSetIterator  TwoPhaseIteratorAsDocIndexIterator
BitSetConjunctionDISI               BitSetConjunctionDisi                  BitSetConjunctionDisi
IntArrayDocIdSetIterator            IntArraySegDocIdSetIterator            IntArrayDocIndexIterator

asDocIdSetIterator                  asSegDocIdSetIterator                  asDocIndexIterator
getDocId                            getSegDocId                            getDocIndex
docID                               sdocID                                 docIndex

docID                               sdocID                                 docIdx
docId                               sdocId                                 docIdx
docIDs                              sdocIDs                                docIdxs
docIds                              sdocIds                                docIdxs
disi                                sdisi                                  dixi
docIdSet                            sDocIdSet                              docIndexSet

"""
{code}

(The identifiers here are for local classes, methods and variables.)
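
One way such three-column strings could drive a rename is sketched below. This is an assumption about how the table might be consumed, not the script actually used for this issue; {{parse_renames}} and {{apply_renames}} are hypothetical helpers, and the sample table is a three-row excerpt.

```python
import re


def parse_renames(table):
    """Map each current name to its (segment_variant, index_variant) pair."""
    tokens = table.split()
    if len(tokens) % 3 != 0:
        raise ValueError("expected three whitespace-separated names per row")
    return {tokens[i]: (tokens[i + 1], tokens[i + 2])
            for i in range(0, len(tokens), 3)}


def apply_renames(source, renames, column):
    """Rename whole identifiers only; column 0 = segment variant, 1 = index variant."""
    if not renames:
        return source
    # The \b anchors keep DocIdSet from matching inside DocIdSetIterator
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(name) for name in renames) + r")\b")
    return pattern.sub(lambda m: renames[m.group(1)][column], source)


sample_table = """
DocIdSet          SegDocIdSet          DocIndexSet
DocIdSetIterator  SegDocIdSetIterator  DocIndexIterator
docID             sdocID               docIdx
"""

renames = parse_renames(sample_table)
line = "DocIdSetIterator it = set.iterator(); int docID = it.nextDoc();"
as_segment = apply_renames(line, renames, 0)
as_index = apply_renames(line, renames, 1)
```

Because the parser splits on any whitespace, a row that wraps onto a second line still parses; a row repeated with a different target would silently overwrite the earlier one, so duplicates such as the two {{docID}} rows would need resolving first.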

I don't like overloading index for this, especially in the class names, so for 
now I'd prefer the segment variants in the second column.

Anyway, we could use the opportunity to shorten some of the longer names.





[jira] [Commented] (LUCENE-7453) Change naming of variables/apis from docid to docnum

2016-09-21 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15510820#comment-15510820
 ] 

Yonik Seeley commented on LUCENE-7453:
--


Indeed, "docIndex" can also be read as "an index of documents" (just like term 
index is an index of terms).
docOrd is another option, but I don't like it.




[jira] [Commented] (LUCENE-7453) Change naming of variables/apis from docid to docnum

2016-09-21 Thread Ryan Ernst (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15510784#comment-15510784
 ] 

Ryan Ernst commented on LUCENE-7453:


bq. although I do dislike using such an overloaded term

Just to be perfectly clear about this statement, I meant the overloading of 
"index", because it is used in so many ways (inverted index and array index, 
which are superficially similar but very different things: one is a collection 
of data structures in files, and the other is a number).




[jira] [Commented] (LUCENE-7453) Change naming of variables/apis from docid to docnum

2016-09-21 Thread Ryan Ernst (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15510719#comment-15510719
 ] 

Ryan Ernst commented on LUCENE-7453:


I'm ok with {{docIndex}}, it is at least an improvement over {{docid}} 
(although I do dislike using such an overloaded term).




[jira] [Commented] (LUCENE-7453) Change naming of variables/apis from docid to docnum

2016-09-21 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15509918#comment-15509918
 ] 

Dawid Weiss commented on LUCENE-7453:
-

bq. [...] since everybody who has low-level integration with Lucene probably 
messes around with the DocIdSetIterator

Amen to this. {{DocIndexIterator}} also sounds like a sensible name in this 
context, doesn't it?




[jira] [Commented] (LUCENE-7453) Change naming of variables/apis from docid to docnum

2016-09-21 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15509889#comment-15509889
 ] 

Adrien Grand commented on LUCENE-7453:
--

bq. When you think of documents inside a reader as a contiguous "array" of 
documents, the index makes a lot of sense

+1 to this point, this is the analogy I always make in order to explain doc 
ids. I like moving to a name that suggests that the doc id is an index rather 
than an identifier.

Regarding the class renaming, even though these are expert/low-level classes, I 
agree it'd be nicer to do the renaming in a major release since everybody who 
has low-level integration with Lucene probably messes around with the 
DocIdSetIterator class. Unless maybe we can figure out a way to make the 
migration easier by either making DocIndexIterator (or whatever the new name 
would be) a parent class or subclass of DocIdSetIterator and deprecating 
DocIdSetIterator?
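
A rough sketch of the "new name as parent class, old name deprecated" option follows, written in Python to match the rename table elsewhere in this thread rather than in Java. Both classes and the {{next_doc}} method are hypothetical simplifications; the real DocIdSetIterator API is not reproduced here.

```python
import warnings


class DocIndexIterator:
    """Hypothetical new name; it carries the actual behaviour."""

    def __init__(self, doc_indexes):
        self._it = iter(sorted(doc_indexes))

    def next_doc(self):
        # Return the next doc index, or None once exhausted
        return next(self._it, None)


class DocIdSetIterator(DocIndexIterator):
    """Old name kept as a thin, deprecated subclass during the migration."""

    def __init__(self, doc_indexes):
        warnings.warn(
            "DocIdSetIterator is deprecated; use DocIndexIterator",
            DeprecationWarning,
            stacklevel=2,
        )
        super().__init__(doc_indexes)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = DocIdSetIterator([5, 1, 3])

first = legacy.next_doc()
```

With the old name as the subclass, existing code that constructs the old type keeps working and can be passed wherever the new type is accepted. Methods that now return the new type would still break callers declared against the old one, which may be why a major release was suggested.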






[jira] [Commented] (LUCENE-7453) Change naming of variables/apis from docid to docnum

2016-09-21 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15509831#comment-15509831
 ] 

Dawid Weiss commented on LUCENE-7453:
-

bq. Are you fine with getting docIndex from IndexReader or IndexSearcher after 
submitting docs to IndexWriter?

Yes, I think so. When you add a document to an IndexWriter you don't get any 
document "id" (or number) anyway. Documents are indexed and made available to 
you once you acquire a new IndexReader -- and then each document will be 
uniquely described with an "index", valid only within this particular 
IndexReader. I think this makes sense, even when you think of methods like 
{{maxDoc}} which could read {{maxDocIndex}}...
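
The write-then-read sequence described here can be modelled in a few lines. {{ToyWriter}} is a hypothetical illustration, not the IndexWriter API; the only assumptions are that adding a document returns nothing and that a reader is a point-in-time snapshot.

```python
class ToyWriter:
    """Hypothetical writer: buffers documents and hands out no ids."""

    def __init__(self):
        self._docs = []

    def add_document(self, doc):
        # Returns None on purpose: nothing identifies the doc at write time
        self._docs.append(doc)

    def open_reader(self):
        # A reader is a point-in-time snapshot of what has been written so far
        return tuple(self._docs)


writer = ToyWriter()
result = writer.add_document("first")
reader_1 = writer.open_reader()

writer.add_document("second")
reader_2 = writer.open_reader()

# Doc indexes exist only relative to a particular reader
indexes_1 = list(range(len(reader_1)))
indexes_2 = list(range(len(reader_2)))
```

Nothing returned by the writer names a document; the numbers appear only once a reader exists, and each reader has its own valid range.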




[jira] [Commented] (LUCENE-7453) Change naming of variables/apis from docid to docnum

2016-09-21 Thread Mikhail Khludnev (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15509814#comment-15509814
 ] 

Mikhail Khludnev commented on LUCENE-7453:
--

C'mon. Are you fine with getting {{docIndex}} from {{IndexReader}} or 
{{IndexSearcher}} after submitting docs to {{IndexWriter}}? 




[jira] [Commented] (LUCENE-7453) Change naming of variables/apis from docid to docnum

2016-09-21 Thread Mikhail Khludnev (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15509799#comment-15509799
 ] 

Mikhail Khludnev commented on LUCENE-7453:
--

bq. what is "numbering" of documents?
Numbering means identifying in a sequence. fwiw, it would be great if this 
naming convention allowed us to distinguish between segment-level 
num/id/whatever and global ones.




[jira] [Commented] (LUCENE-7453) Change naming of variables/apis from docid to docnum

2016-09-21 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15509734#comment-15509734
 ] 

Uwe Schindler commented on LUCENE-7453:
---

+1 to DocIndex.

Like [~paul.elsc...@xs4all.nl], I see the problem that we might need to rename 
some classes, too :(




[jira] [Commented] (LUCENE-7453) Change naming of variables/apis from docid to docnum

2016-09-21 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15509692#comment-15509692
 ] 

David Smiley commented on LUCENE-7453:
--

+1 to docIndex -- nice name.

Presumably we'd change both master & 6x so as to minimize the pain of 
backporting other issues?  Also... I wonder if some issues like this one are 
best delayed until the next major release nears.  Again, to ease 
feature/bug back-porting.




[jira] [Commented] (LUCENE-7453) Change naming of variables/apis from docid to docnum

2016-09-21 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15509458#comment-15509458
 ] 

Dawid Weiss commented on LUCENE-7453:
-

The more I think about it, the more I think docIndex is actually quite all 
right in this context. When you think of documents inside a reader as a 
contiguous "array" of documents, the index makes a lot of sense. The index is 
also not at all associated with the document -- much like an array index tells 
you nothing about the object at that index. Clarified documentation saying 
that document indexes remain constant for an open IndexReader (but can change 
upon its reopening!) would make it even better.
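
To make the array analogy concrete, here is a minimal sketch in plain Java. It is a toy model, not Lucene code: {{ToyReader}}, {{indexOf}} and {{reopenWithout}} are hypothetical names invented for this illustration, and "reopening" is simulated by simply dropping a deleted document. It only shows that an index stays fixed for one reader snapshot and may differ in the next one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DocIndexDemo {

    /** Toy stand-in for a point-in-time reader; NOT Lucene's IndexReader. */
    static final class ToyReader {
        // The position of a key in this list plays the role of docIndex.
        private final List<String> docs;

        ToyReader(List<String> docs) {
            this.docs = new ArrayList<>(docs); // private copy: fixed for this reader
        }

        int maxDoc() {
            return docs.size();
        }

        /** Returns the docIndex of the given key in this reader, or -1 if absent. */
        int indexOf(String key) {
            return docs.indexOf(key);
        }

        /** Hypothetical "reopen": a new reader in which the deleted document is gone. */
        ToyReader reopenWithout(String deletedKey) {
            List<String> remaining = new ArrayList<>(docs);
            remaining.remove(deletedKey);
            return new ToyReader(remaining);
        }
    }

    public static void main(String[] args) {
        ToyReader reader = new ToyReader(Arrays.asList("doc-a", "doc-b", "doc-c"));
        System.out.println(reader.indexOf("doc-c"));   // 2

        ToyReader reopened = reader.reopenWithout("doc-a");
        System.out.println(reader.indexOf("doc-c"));   // still 2 in the old reader
        System.out.println(reopened.indexOf("doc-c")); // 1 in the reopened reader
    }
}
```

The index says nothing about the document itself; only the key ("doc-c" here) identifies it across readers.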

Ryan, what do you think?




[jira] [Commented] (LUCENE-7453) Change naming of variables/apis from docid to docnum

2016-09-21 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15509437#comment-15509437
 ] 

Michael McCandless commented on LUCENE-7453:


+1 for {{docIndex}}!




[jira] [Commented] (LUCENE-7453) Change naming of variables/apis from docid to docnum

2016-09-21 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15509280#comment-15509280
 ] 

Dawid Weiss commented on LUCENE-7453:
-

bq. I think docNum is a good improvement because it makes it sound like we are 
numbering the documents, not assigning a unique identifier to them.

Sorry, but this explanation is even more controversial and vague to me 
(what is "numbering" of documents?). I'd prefer simply explaining that 
identifiers are persistent within an index segment (because they are), but 
index segments can be merged and thus a document may be moved across index 
segments over time, changing its per-segment identifier. 

If we really wish to make loops like this not use the "id" naming:
{code}
for (int docId = 0, max = indexReader.maxDoc(); docId < max; docId++) {
  // do something
}
{code}

then really {{docNum}} doesn't make it any better. Even {{docIndex}} seems 
better to me; in fact, this "index" makes sense both at segment level (where 
the index doesn't change) and at composite reader level (where the 'index' of a 
document has more complex semantics). If we make it clear that the document 
index is volatile and is valid (and constant) only for an opened reader, then 
this is clearer to me.

{code}
for (int docIndex = 0, max = indexReader.maxDoc(); docIndex < max; docIndex++) {
  // do something
}
{code}
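
The "more complex semantics" at composite reader level can be sketched roughly as follows. This is a toy illustration in plain Java, not Lucene code: the class and method names ({{DocBaseDemo}}, {{docBases}}, {{toGlobal}}, {{leafOf}}) are hypothetical. The assumption is that a composite index is the segment-local index plus an offset equal to the number of documents in the preceding segments, similar in spirit to the {{docBase}} offset Lucene keeps per leaf.

```java
final class DocBaseDemo {

    /** bases[i] = total number of documents in all leaves before leaf i. */
    static int[] docBases(int[] leafMaxDocs) {
        int[] bases = new int[leafMaxDocs.length];
        int next = 0;
        for (int i = 0; i < leafMaxDocs.length; i++) {
            bases[i] = next;
            next += leafMaxDocs[i];
        }
        return bases;
    }

    /** Converts a segment-local index into a composite-level index. */
    static int toGlobal(int[] bases, int leaf, int localDocIndex) {
        return bases[leaf] + localDocIndex;
    }

    /** Finds the leaf that contains the given composite-level index (linear scan). */
    static int leafOf(int[] bases, int globalDocIndex) {
        int leaf = 0;
        while (leaf + 1 < bases.length && bases[leaf + 1] <= globalDocIndex) {
            leaf++;
        }
        return leaf;
    }

    public static void main(String[] args) {
        int[] bases = docBases(new int[] {3, 2, 4}); // three toy segments
        int global = toGlobal(bases, 1, 1);
        System.out.println(global);                  // 4
        int leaf = leafOf(bases, global);
        System.out.println(leaf);                    // 1
        System.out.println(global - bases[leaf]);    // 1 (back to the local index)
    }
}
```

Both numbers are positions rather than identifiers: the local one is relative to a segment, the composite one to the whole reader.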






[jira] [Commented] (LUCENE-7453) Change naming of variables/apis from docid to docnum

2016-09-21 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15509250#comment-15509250
 ] 

Michael McCandless commented on LUCENE-7453:


+1 to rename {{docId}} -> {{docNum}}.

{{id}}, short for {{identifier}}, implies you have a lasting unique identifier, 
a primary key.  This has tripped up a number of users over time, who were 
surprised to see that segment merging unexpectedly shuffles the "ids" they got 
back from searches.

I think {{docNum}} is a good improvement because it makes it sound like we are 
numbering the documents, not assigning a unique identifier to them.
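
The surprise described above can be reproduced with a toy model in plain Java. This is only a sketch under simplifying assumptions, not how Lucene merges segments: {{MergeRenumberDemo}} and {{merge}} are hypothetical names, a segment is modelled as a list of primary keys, and a merge is modelled as concatenating the live documents in order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class MergeRenumberDemo {

    /**
     * Toy merge: concatenates the live keys of all segments in order.
     * A key's position in the returned list is its new document number.
     */
    static List<String> merge(List<List<String>> segments, List<String> deleted) {
        List<String> merged = new ArrayList<>();
        for (List<String> segment : segments) {
            for (String key : segment) {
                if (!deleted.contains(key)) {
                    merged.add(key);
                }
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<String> seg0 = Arrays.asList("id-1", "id-2");
        List<String> seg1 = Arrays.asList("id-3", "id-4");

        // Before the merge, "id-4" has number 1 inside its own segment.
        System.out.println(seg1.indexOf("id-4"));   // 1

        // After merging (with "id-2" deleted) the same document has number 2.
        List<String> merged = merge(Arrays.asList(seg0, seg1), Arrays.asList("id-2"));
        System.out.println(merged.indexOf("id-4")); // 2
    }
}
```

The primary key "id-4" is what stays stable; the number is only a position in whichever segment currently holds the document.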




[jira] [Commented] (LUCENE-7453) Change naming of variables/apis from docid to docnum

2016-09-21 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15509049#comment-15509049
 ] 

Dawid Weiss commented on LUCENE-7453:
-

I'm in favor of cleaning up wording too -- don't get me wrong -- I just don't 
feel like {{docnum}} is particularly more expressive than {{docid}} and as Paul 
pointed out, there are other places which use the "docid" idiom.

As for risks, Uwe, there are none, obviously, but we have code that uses the 
'docid' in local variables to be consistent with Lucene naming, that's all.




[jira] [Commented] (LUCENE-7453) Change naming of variables/apis from docid to docnum

2016-09-20 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15508210#comment-15508210
 ] 

David Smiley commented on LUCENE-7453:
--

bq. To me the difference between docnum and docid is really that docnum is one 
letter longer :) Seriously, it doesn't seem to be explaining anything more than 
docid does. It would be more self-explanatory to call it docSegmentIndex, but 
this seems verbose.

I feel the same way -- there is no semantic difference.  




[jira] [Commented] (LUCENE-7453) Change naming of variables/apis from docid to docnum

2016-09-20 Thread Paul Elschot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15508004#comment-15508004
 ] 

Paul Elschot commented on LUCENE-7453:
--

I'm in favour, but it is going to be hard:
{code}
public abstract class DocIdSet ...
public abstract class DocIdSetIterator ...
{code}
I vaguely recall not really agreeing with these names, but it is probably too 
late now.






[jira] [Commented] (LUCENE-7453) Change naming of variables/apis from docid to docnum

2016-09-20 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15507786#comment-15507786
 ] 

Uwe Schindler commented on LUCENE-7453:
---

I was the one who suggested that change. Imho the change would only modify 
parameter names and maybe some getters, although they are mostly called doc() 
in iterators. So this would not break anything or require updates in users' 
code. Just naming of parameters and corresponding javadocs. So risk is low.




[jira] [Commented] (LUCENE-7453) Change naming of variables/apis from docid to docnum

2016-09-20 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15507544#comment-15507544
 ] 

Dawid Weiss commented on LUCENE-7453:
-

To me the difference between {{docnum}} and {{docid}} is really that {{docnum}} 
is one letter longer :) Seriously, it doesn't seem to be explaining anything 
more than {{docid}} does. It would be more self-explanatory to call it 
{{segmentIndex}}, but this seems verbose.

Don't you think adding better documentation (in one place and linking to it) 
would be a better idea than just renaming? Also, the nomenclature here has been 
with us for years. I don't see an obvious benefit of switching to {{docnum}} 
for new users and I see how it may be a confusing change to existing 
Lucene-experienced developers (especially if they have their own code that 
would stick to "docid" in local variables, etc.).
