[jira] [Commented] (LUCENE-8267) Remove memory codecs from the codebase

2018-09-01 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16599797#comment-16599797
 ] 

ASF subversion and git services commented on LUCENE-8267:
-

Commit d93c46ea94dec612aa53e37d119fe34b5e8a828e in lucene-solr's branch 
refs/heads/master from [~dsmiley]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=d93c46e ]

LUCENE-8267: adjust CHANGES.txt advise


> Remove memory codecs from the codebase
> --
>
> Key: LUCENE-8267
> URL: https://issues.apache.org/jira/browse/LUCENE-8267
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Major
> Fix For: master (8.0)
>
> Attachments: LUCENE-8267.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Memory codecs (MemoryPostings*, MemoryDocValues*) are part of random 
> selection of codecs for tests and cause occasional OOMs when a test with huge 
> data is selected. We don't use those memory codecs anywhere outside of tests, 
> it has been suggested to just remove them to avoid maintenance costs and OOMs 
> in tests. [1]
> [1] https://apache.markmail.org/thread/mj53os2ekyldsoy3



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [jira] [Commented] (LUCENE-8267) Remove memory codecs from the codebase

2018-08-31 Thread Simon Willnauer



++

> On 31. Aug 2018, at 17:55, David Smiley (JIRA)  wrote:
> 
> 
>[ 
> https://issues.apache.org/jira/browse/LUCENE-8267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598935#comment-16598935
>  ] 
> 
> David Smiley commented on LUCENE-8267:
> --
> 
> "... consider removing to use the default or experiment with one of the 
> others."
> 
> Okay Simon?  They will visit this issue and/or dig to see what others exist 
> to make the decision for themselves.
> 



[jira] [Commented] (LUCENE-8267) Remove memory codecs from the codebase

2018-08-31 Thread David Smiley (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598935#comment-16598935
 ] 

David Smiley commented on LUCENE-8267:
--

"... consider removing to use the default or experiment with one of the others."

Okay Simon?  They will visit this issue and/or dig to see what others exist to 
make the decision for themselves.




[jira] [Commented] (LUCENE-8267) Remove memory codecs from the codebase

2018-08-31 Thread Dawid Weiss (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598335#comment-16598335
 ] 

Dawid Weiss commented on LUCENE-8267:
-

It'd be ideal if Solr had a migration.txt file, not just changes.txt. It'd be a 
better fit there. If you insist on having the FST50 mentioned, I'd suggest 
something like:
{code}
* LUCENE-8267: Memory codecs have been removed from the codebase (MemoryPostings,
  MemoryDocValues). If you used postingsFormat="Memory" or docValuesFormat="Memory", consider
  using the defaults. For in-memory postings, you can try the "FST50" format as an alternative
  to "Memory". (Dawid Weiss)
{code}




[jira] [Commented] (LUCENE-8267) Remove memory codecs from the codebase

2018-08-30 Thread Simon Willnauer (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597436#comment-16597436
 ] 

Simon Willnauer commented on LUCENE-8267:
-

I personally don't think we should put FST50 into this message. The message 
links to this issue which has all the discussion. 




[jira] [Commented] (LUCENE-8267) Remove memory codecs from the codebase

2018-08-30 Thread David Smiley (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597386#comment-16597386
 ] 

David Smiley commented on LUCENE-8267:
--

Can someone propose new wording here, or is my proposal fine?  Note my proposal 
mentions the default twice, both for postingsFormat and docValuesFormat.  We 
agree that the default codec is an excellent codec.  Remember that someone who 
chooses another one has done so explicitly and is thus aware of the default 
codec already and yet chose something else as a better fit for them.  I want 
the wording to mention FST50 as an option to try; this postingsFormat seems to 
fly under the radar of people's awareness.  Ultimately the user is going to 
have to do their own experiments to make the choice for themselves.




[jira] [Commented] (LUCENE-8267) Remove memory codecs from the codebase

2018-08-30 Thread Simon Willnauer (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597151#comment-16597151
 ] 

Simon Willnauer commented on LUCENE-8267:
-

+1 to use defaults as well.




[jira] [Commented] (LUCENE-8267) Remove memory codecs from the codebase

2018-08-30 Thread Adrien Grand (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597132#comment-16597132
 ] 

Adrien Grand commented on LUCENE-8267:
--

+1 to recommend defaults




[jira] [Commented] (LUCENE-8267) Remove memory codecs from the codebase

2018-08-30 Thread Dawid Weiss (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597114#comment-16597114
 ] 

Dawid Weiss commented on LUCENE-8267:
-

I don't have an opinion on this, really. Hardcoding FST50 seems like binding to 
a concrete version? You're probably right that Direct is not the best choice 
though. Perhaps suggest leaving it at the default value like docValues?




[jira] [Commented] (LUCENE-8267) Remove memory codecs from the codebase

2018-08-29 Thread David Smiley (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596879#comment-16596879
 ] 

David Smiley commented on LUCENE-8267:
--

[~dweiss] I noticed the solr/CHANGES.txt entry you added recommended users 
switch to "Direct" instead.  I'm surprised we would recommend that 
(especially given the demise of "Memory").  Wouldn't FST50 be better?  I'd like 
to reword the CHANGES.txt to the following:
{noformat}
* LUCENE-8267: Memory codecs have been removed from the codebase (MemoryPostings,
  MemoryDocValues). If you used postingsFormat="Memory" switch to "FST50" as the next best alternative,
  or use the default.  If you used docValuesFormat="Memory" then remove it to get the default. (Dawid Weiss)
{noformat}




[jira] [Commented] (LUCENE-8267) Remove memory codecs from the codebase

2018-05-08 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467071#comment-16467071
 ] 

ASF subversion and git services commented on LUCENE-8267:
-

Commit 85c00e77efdf53f30da6eaffd38c2b016a7805bc in lucene-solr's branch 
refs/heads/master from [~dawid.weiss]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=85c00e7 ]

LUCENE-8267: removed references to memory codecs.





[jira] [Commented] (LUCENE-8267) Remove memory codecs from the codebase

2018-05-08 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467064#comment-16467064
 ] 

Dawid Weiss commented on LUCENE-8267:
-

I ran nightly tests three times, but I can't get past Solr tests failing -- 
different tests each time, don't seem to be related to the change (cloud, 
distributed).

I'm committing it in, regardless of those failures.




[jira] [Commented] (LUCENE-8267) Remove memory codecs from the codebase

2018-05-07 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16465910#comment-16465910
 ] 

Dawid Weiss commented on LUCENE-8267:
-

Removed references to memory postings and memory docvalues. An aggregate of 
changes is here, precommit passes, running tests now.

https://github.com/apache/lucene-solr/pull/372




[jira] [Commented] (LUCENE-8267) Remove memory codecs from the codebase

2018-05-04 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463572#comment-16463572
 ] 

Dawid Weiss commented on LUCENE-8267:
-

I was on short holidays, I'll take care of it soon.




[jira] [Commented] (LUCENE-8267) Remove memory codecs from the codebase

2018-05-02 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461588#comment-16461588
 ] 

David Smiley commented on LUCENE-8267:
--

With the help of others using the SolrTextTagger, we've concluded that the 
speed difference is negligible. I'm glad we've then reached consensus that the 
MemoryPostingsFormat will not be missed! :D

+1 to remove MemoryPostingsFormat & DirectPostingsFormat
{quote}I think filing a JIRA issue is kind of soliciting feedback, don't you 
think?
{quote}
No! At least not beyond our insular world.




[jira] [Commented] (LUCENE-8267) Remove memory codecs from the codebase

2018-04-23 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448788#comment-16448788
 ] 

Robert Muir commented on LUCENE-8267:
-

{quote}
Thanks for the suggestion to use MMapDirectory.preload, I didn't know about it, 
but that appears to only help warmup, not sustained performance; right?
{quote}

Loading stuff into heap memory gives no higher guarantee than doing it this way: 
under pressure, it still depends on VM parameters.

{quote}
I get the maintenance aspect but we need community input on such decisions to 
ascertain real-world use.
{quote}

That is not how it works: this is open source. These memory/direct formats 
cause excessive maintenance hassle with the tests. I saw Alan and Dawid 
fighting with them and it seemed clear to me it's not worth the trouble. We 
should remove them: the cost is too high.

Someone can always pull in the source code themselves for their esoteric 
use-case: but unless we have *maintainers* coming up then they need to go: this 
doesn't come down to a vote by users.

If you want to make it hard for us to clean up tech debt like this, by -1s and 
so on, that's your choice. But it is also my choice to make it hard to add 
things. 

Trust me, I will make it equally hard to add code as it is to remove code. It 
is the only way to make things sustainable.





[jira] [Commented] (LUCENE-8267) Remove memory codecs from the codebase

2018-04-23 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448771#comment-16448771
 ] 

Dawid Weiss commented on LUCENE-8267:
-

bq. I get the maintenance aspect but we need community input on such decisions 
to ascertain real-world use.

I think filing a JIRA issue is kind of soliciting feedback, don't you think? I 
agree with Simon and Robert that there are classes that, while useful, are not 
at the forefront of what a broad "Lucene API" is... We should have the liberty 
to adjust or remove such things. I scanned the code of both Lucene and Solr and 
there were no references (other than in tests) to those classes, so it's not 
just "Lucene land".

Also, given the size and diversity of the Lucene/Solr user community I'm fairly 
confident there will always be somebody who finds something very useful, no 
matter what you'd like to change or remove. Hell, I use a lot of internal 
Lucene infrastructure in my own projects and sometimes I miss things that go 
away myself... (and frequently I just grab the latest source of something and 
copy it over to maintain in my own source tree, that's part of the beauty of 
open source).




[jira] [Commented] (LUCENE-8267) Remove memory codecs from the codebase

2018-04-23 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448748#comment-16448748
 ] 

David Smiley commented on LUCENE-8267:
--

Ah; I incorrectly assumed the proposal included the FST postings formats but 
apparently not. It's too bad FSTPulsingFormat is long gone though since in the 
text-tagging use-case it'd effectively be a substitute for 
MemoryPostingsFormat. The FSTTermsReader accepts a PostingsReaderBase; maybe 
it's possible to write an in-memory version of PostingsReaderBase, at least for 
the "pulsed" (single posting) case. Nonetheless let's see how the text tagger 
performs with these codec options.

Thanks for the suggestion to use MMapDirectory.preload, I didn't know about it, 
but that appears to only help warmup, not sustained performance; right? And I 
believe even with FileSwitchDirectory, on shutdown files with certain 
extensions would vanish; right?
{quote}So I perceive your veto as an aggressive step. To me it's a last resort 
after we can't find a solution that is good for all of us. The conversation 
already has a tone that is not appropriate and could have been prevented by 
formulating objections as questions. like I am using this postings format in X 
and it's serving well, what are the alternatives. - I am sure you would have 
got an awesome answer.
{quote}
The "sorry" word immediately after my veto was intended to prevent 
misperceptions about tone; I don't mean to be aggressive – sorry!  I agree I 
could have asked for alternatives up-front; I'll try and remember that next 
time. I was thinking my early vote could prevent work that someone does in vain 
to remove these pieces. In retrospect I didn't need to vote yet to accomplish 
that (e.g. convey disagreement with others).  In this way I was trying to offer 
improved communication where from others I've seen no veto but a confusing 
cloud of doubt as to whether there would be a veto or not (which in my mind is 
worse).  I respect you may feel differently though; just please understand my 
intended tone is not aggressive.
{quote}if you can't remove stuff without others jumping in vetoing the reaction 
will be to prevent additions in the same way due to _fear_  created by the 
veto. This is a terrible place to be in, we have seen this in the past we 
should prevent it.
{quote}
Do you mean if we add some new thingamajig, we might feel that we *have* to 
support it indefinitely?  (I wouldn't use the word "fear" for this; maybe I've 
got your intent wrong still)  Hmmm; I think it's very situationally dependent.  
For example with queryNorm & coords, LUCENE-7347, I had concerns but ultimately 
understood that maintaining these things was making things awkward for us.  
But the PostingsFormats seem different to me.  They conform to our APIs; they 
don't get in the way or tie our hands.  Yes there is maintenance though.  I 
think what I objected to most in the description of this issue was the notion 
that, because Lucene-core doesn't use something and because there is 
maintenance to that something, then we should delete that something.  I get the 
maintenance aspect but we need community input on such decisions to ascertain 
real-world use.




[jira] [Commented] (LUCENE-8267) Remove memory codecs from the codebase

2018-04-23 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448287#comment-16448287
 ] 

Simon Willnauer commented on LUCENE-8267:
-

+1 to what [~rcmuir] said: there are so many more efficient options.

{quote}Do you mean to say I should have said all I said without voting first? 
Lets have a conversation! (we _are_ having a conversation){quote}

So I perceive your veto as an aggressive step. To me it's a last resort after 
we can't find a solution that is good for all of us. The conversation already 
has a tone that is not appropriate and could have been prevented by formulating 
objections as questions, like _I am using this postings format in X and it's 
serving well, what are the alternatives?_ I am sure you would have got an 
awesome answer.

{quote}I don't understand this point of view; can you please elaborate? Fear of 
what?{quote}

If you can't remove stuff without others jumping in and vetoing, the reaction 
will be to prevent additions in the same way, due to _fear_ created by the veto. 
This is a terrible place to be in; we have seen this in the past and we should 
prevent it.

 




[jira] [Commented] (LUCENE-8267) Remove memory codecs from the codebase

2018-04-23 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448272#comment-16448272
 ] 

Robert Muir commented on LUCENE-8267:
-

There are a lot of other alternatives to putting data in heap memory directly 
in the postings format.

The best (IMO) is for the user to use MMapDirectory.preload with the standard 
index format. This way it doesn't impact their Java heap and they use a 
supported index format. Users can also use RAMDirectory/FileSwitchDirectory to 
load specified files into heap. 

Finally, users can use FSTPostingsFormat, which will load the *term dictionary 
only* into a heap FST. This is way different from Memory/Direct, which load not 
only terms but also postings lists, positions, and so on, all into heap RAM.

So I don't really see any technical merit for your objection: there are many 
other ways to have a RAM-resident terms dictionary, many of them better than 
the inefficient Memory/Direct formats.




[jira] [Commented] (LUCENE-8267) Remove memory codecs from the codebase

2018-04-23 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448249#comment-16448249
 ] 

David Smiley commented on LUCENE-8267:
--

bq. given that you know that you are using your veto here we are already in a 
terrible position to have any conversation

Do you mean to say I should have said all I said without voting first?  Let's 
have a conversation!  (we _are_ having a conversation)

bq.  we will have a super hard time adding stuff. It creates fear driven 
decisions.

I don't understand this point of view; can you please elaborate?  Fear of what?

bq.  Can you quantify the "it's nice"?

Yes, I shall do that.  My preferred route is to find an existing user of the 
"Solr Text Tagger" who can experiment with the postingsFormat setting to try a 
comparison with the default format.  Failing that, I'll create a benchmark 
using that project.




[jira] [Commented] (LUCENE-8267) Remove memory codecs from the codebase

2018-04-23 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448208#comment-16448208
 ] 

Simon Willnauer commented on LUCENE-8267:
-

{quote}
If we are going to make it harder to remove stuff, I have no problem being the 
one to make it equally harder to add stuff.
{quote}
 
I agree, this is one of those issues that we have to face. If we put the bar 
very high to remove stuff that is not mainstream, then we will have a super hard 
time adding stuff. It creates fear driven decisions. It sucks, and I agree with 
[~rcmuir] 100% here.
 
{quote}
-1 sorry. I've used the MemoryPostingsFormat for a text-tagging use-case where 
there are intense lookups against the terms dictionary. It's highly beneficial 
to have the terms dictionary be entirely memory resident, albeit in a compact 
FST. The issue description mentions "We don't use those memory codecs anywhere 
outside of tests" – this should be no surprise as it's not the default codec. 
I'm sure it may be hard to gauge the level of use of something outside of 
core-Lucene. When we ponder removing something that Lucene doesn't even _need_, 
I propose we raise the issue more openly to the community. Perhaps the question 
could be proposed in CHANGES.txt and/or release announcements to solicit 
community input?
{quote}
 
given that you know that you are using your veto here we are already in a 
terrible position to have any conversation. Can you quantify the "it's nice"? 
Since there are alternatives (the standard codec), can you go and provide some 
numbers? We should not use vetoes based on non-quantifiable arguments IMO. We 
can go and ask the community, but I don't expect much useful outcome; most 
folks don't know what they are using here and there. Nevertheless, I am 
happy to send a mail to dev to get this information. 




[jira] [Commented] (LUCENE-8267) Remove memory codecs from the codebase

2018-04-23 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448143#comment-16448143
 ] 

Robert Muir commented on LUCENE-8267:
-

If we are going to make it harder to remove stuff, I have no problem being the 
one to make it equally harder to add stuff.




[jira] [Commented] (LUCENE-8267) Remove memory codecs from the codebase

2018-04-23 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448123#comment-16448123
 ] 

David Smiley commented on LUCENE-8267:
--

-1 sorry. I've used the MemoryPostingsFormat for a text-tagging use-case where 
there are intense lookups against the terms dictionary.  It's highly beneficial 
to have the terms dictionary be entirely memory resident, albeit in a compact 
FST.  The issue description mentions "We don't use those memory codecs anywhere 
outside of tests" -- this should be no surprise as it's not the default codec.  
I'm sure it may be hard to gauge the level of use of something outside of 
core-Lucene.  When we ponder removing something that Lucene doesn't even 
_need_, I propose we raise the issue more openly to the community.  Perhaps the 
question could be proposed in CHANGES.txt and/or release announcements to 
solicit community input?

Perhaps BaseRangeFieldQueryTestCase.verify should ascertain if the postings 
format is a known "memory" postings format (of which there are several, to 
include "Direct"), and if so then use JUnit's Assume to bail out?  If this is 
hard to do, we ought to add a convenience method to make it easier.
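
A minimal sketch of what such a convenience method might look like, using only 
the JDK. The class name MemoryFormatCheck and the method isHeapResident are 
hypothetical and are not part of Lucene's test framework; the format names 
"Memory" and "Direct" are assumed from the formats named in this thread and 
would need checking against the names the codecs actually register.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper; not an existing Lucene or test-framework class. */
final class MemoryFormatCheck {

  // Assumption: these are the registered names of the heap-resident postings
  // formats discussed in this thread. Extend the set if others qualify.
  private static final Set<String> HEAP_RESIDENT_FORMATS =
      Collections.unmodifiableSet(new HashSet<>(Arrays.asList("Memory", "Direct")));

  private MemoryFormatCheck() {}

  /** Returns true if the given postings format name is a known heap-resident format. */
  static boolean isHeapResident(String postingsFormatName) {
    // A null name means the format is unknown, so it is not treated as heap-resident.
    return postingsFormatName != null && HEAP_RESIDENT_FORMATS.contains(postingsFormatName);
  }
}
```

A test with huge data could then bail out with JUnit's 
Assume.assumeFalse("heap-resident postings format", 
MemoryFormatCheck.isHeapResident(formatName)), where formatName is however the 
test obtains the name of the randomly selected format.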

Speaking of memory postings formats, I'm in favor of the Direct postings format 
going away since it ought to be re-imagined as some sort of read-time 
FilterCodecReader that does not require an index format.  Credit to Alan for 
that idea years ago.  Though that's more of a re-orientation of something that 
exists rather than saying it should go away entirely.




[jira] [Commented] (LUCENE-8267) Remove memory codecs from the codebase

2018-04-23 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448060#comment-16448060
 ] 

Robert Muir commented on LUCENE-8267:
-

+1




[jira] [Commented] (LUCENE-8267) Remove memory codecs from the codebase

2018-04-23 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16447787#comment-16447787
 ] 

Simon Willnauer commented on LUCENE-8267:
-

+1




[jira] [Commented] (LUCENE-8267) Remove memory codecs from the codebase

2018-04-23 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16447751#comment-16447751
 ] 

Adrien Grand commented on LUCENE-8267:
--

+1
