[jira] [Commented] (SOLR-13312) write out responses without creating SolrDocument objects

2019-03-19 Thread Noble Paul (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16796581#comment-16796581
 ] 

Noble Paul commented on SOLR-13312:
---

bq. So pseudo code based on your response

Yeah, pretty much; that is how it should work.

bq. That might work if the codec/code for writing out responses only ever 
iterates linearly through the document anyway... 

That's what response writers do anyway. 

bq. One might even imagine a composition based strategy with an 
.optimizeFieldAccess() method that flips the map 

Yeah, we can have a whitelist of methods that can be accessed without creating 
the Map, say:

* {{forEach()}}
* {{writeMap()}}
* {{getFieldValue()}}
* {{getFirstValue()}}

If any other method is invoked, we can lazily construct the Map-based structure 
that we use today.
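
A minimal sketch of that lazy approach, using only the JDK. {{StoredField}}, {{LazyDoc}} and {{isMaterialized()}} are hypothetical names invented for illustration, not Solr or Lucene classes, and {{getFieldValuesMap()}} merely stands in for "any other method":

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

/** Hypothetical stand-in for one stored field read from the index. */
final class StoredField {
  final String name;
  final Object value;

  StoredField(String name, Object value) {
    this.name = name;
    this.value = value;
  }
}

/** Sketch: stays a flat field list until a non-whitelisted method is called. */
final class LazyDoc {
  private final List<StoredField> fields;  // flat, in stored order
  private Map<String, List<Object>> map;   // null until something needs it

  LazyDoc(List<StoredField> fields) {
    this.fields = fields;
  }

  /** Whitelisted: linear iteration, no Map is built. */
  void forEach(BiConsumer<String, Object> consumer) {
    for (StoredField f : fields) {
      consumer.accept(f.name, f.value);
    }
  }

  /** Whitelisted: linear scan for the first value of a field. */
  Object getFirstValue(String name) {
    for (StoredField f : fields) {
      if (f.name.equals(name)) {
        return f.value;
      }
    }
    return null;
  }

  /** Not whitelisted: builds the Map-based structure once, on first use. */
  Map<String, List<Object>> getFieldValuesMap() {
    if (map == null) {
      map = new LinkedHashMap<>();
      for (StoredField f : fields) {
        map.computeIfAbsent(f.name, k -> new ArrayList<>()).add(f.value);
      }
    }
    return map;
  }

  /** Test hook for this sketch: has the Map been built yet? */
  boolean isMaterialized() {
    return map != null;
  }
}
```

A response writer that only calls {{forEach()}} never pays for the Map; the cost is incurred only by callers that ask for map-style access.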


> write out responses without creating SolrDocument objects
> -
>
> Key: SOLR-13312
> URL: https://issues.apache.org/jira/browse/SOLR-13312
> Project: Solr
>  Issue Type: Sub-task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Priority: Major
>
> Once we get a document from lucene there is no need to create a SolrDocument 
> object to write out the response, if there are no transformers



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-13312) write out responses without creating SolrDocument objects

2019-03-19 Thread Gus Heck (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16796571#comment-16796571
 ] 

Gus Heck commented on SOLR-13312:
-

That interface sounds interesting. So pseudo code based on your response:
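{code:java}
// Pseudocode: wrap the Lucene doc, or convert it when transformers are present
docInter = wrapOrConvert(luceneDoc, have_transformers);
for (t : transformers)
  t.transform(docInter);
// Serialize the resulting doc in the requested response format (wt)
sendResponse(convertDocToWt(wt, docInter));
{code}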
That might work if the codec/code for writing out responses only ever iterates 
linearly through the document anyway... which seems likely for writing a 
response. If the interface provides direct field access, the performance of 
field access would vary depending on the impl behind it: one favoring memory at 
the expense of CPU, the other favoring CPU at the expense of memory (for cases 
expecting lots of direct field access). Certain use cases (low-memory systems) 
might want to force the tradeoff regardless.

One might even imagine a composition based strategy with an 
.optimizeFieldAccess() method that flips the map based backing implementation 
on by swapping in a SolrDocument as a new delegate on demand, so that 
transformers that do nothing but add one more field don't have to require the 
more memory expensive implementation either.

Maybe convert the current SolrDocument class to an inner class of a wrapper 
that takes its name, so that the wrapper can delegate either to a lucene doc 
or to the current impl. Then have an optimizeForFieldAccess() method that code in 
transformers (or elsewhere) can call to hint that a map based backing may be 
helpful for performance (I imagine perhaps allowing a sysprop or config setting 
to deny this request for memory constrained systems or systems handling 
documents with very few fields). A new constructor would create the lucene 
backed version, and the existing constructors would create one backed by maps as 
before... 

Certain methods such as "getFieldValuesAsMap()" might automatically cause 
conversion...
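
A rough sketch of the delegate-swapping idea, JDK only. All names here ({{DocBacking}}, {{FlatBacking}}, {{MapBacking}}, {{SwitchableDoc}}) are hypothetical, the "deny" setting is modeled as a plain constructor flag rather than a real sysprop, and fields are treated as single-valued to keep it short:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical backing abstraction; not a Solr interface. */
interface DocBacking {
  Object get(String field);
  void add(String field, Object value);
}

/** Memory-lean backing: parallel flat lists, O(n) lookup. */
final class FlatBacking implements DocBacking {
  final List<String> names = new ArrayList<>();
  final List<Object> values = new ArrayList<>();

  public Object get(String field) {
    int i = names.indexOf(field);
    return i < 0 ? null : values.get(i);
  }

  public void add(String field, Object value) {
    names.add(field);
    values.add(value);
  }
}

/** CPU-lean backing: hash lookup, more memory per field. */
final class MapBacking implements DocBacking {
  private final Map<String, Object> map = new HashMap<>();

  MapBacking(FlatBacking flat) {
    // Keep the first value per name so get() agrees with FlatBacking.
    for (int i = 0; i < flat.names.size(); i++) {
      map.putIfAbsent(flat.names.get(i), flat.values.get(i));
    }
  }

  public Object get(String field) {
    return map.get(field);
  }

  public void add(String field, Object value) {
    map.putIfAbsent(field, value);
  }
}

/** Wrapper that starts flat and swaps its delegate on request. */
final class SwitchableDoc {
  private final boolean allowMapBacking; // stands in for a sysprop/config setting
  private DocBacking delegate = new FlatBacking();

  SwitchableDoc(boolean allowMapBacking) {
    this.allowMapBacking = allowMapBacking;
  }

  /** A hint only: ignored when map backing is denied or already active. */
  void optimizeForFieldAccess() {
    if (allowMapBacking && delegate instanceof FlatBacking) {
      delegate = new MapBacking((FlatBacking) delegate);
    }
  }

  boolean isMapBacked() {
    return delegate instanceof MapBacking;
  }

  Object get(String field) {
    return delegate.get(field);
  }

  void add(String field, Object value) {
    delegate.add(field, value);
  }
}
```

A transformer that only adds one field would call {{add()}} and never trigger the swap; one doing many lookups would call {{optimizeForFieldAccess()}} first.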

Just a thought. 

 




[jira] [Commented] (SOLR-13312) write out responses without creating SolrDocument objects

2019-03-19 Thread Noble Paul (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16796466#comment-16796466
 ] 

Noble Paul commented on SOLR-13312:
---

bq. By Transformer, you mean DocumentTransformer (e.g. fl=id,name,score,[shard] 
to add the shard info)?

Well, the problem is that we have the DocTransformers working on a concrete 
class called {{SolrDocument}}. We should be able to make transformers work on 
an interface. So any DocTransformer that is written against the new interface 
can possibly work. The most common ones that we ship today can cut over to the 
interface.
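
To illustrate what "work on an interface" could look like, here is a small JDK-only sketch. {{DocView}}, {{ViewTransformer}}, {{SimpleDocView}} and {{ShardTransformer}} are hypothetical names made up for this example; they are not the actual Solr types or the interface that may eventually be added:

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical minimal document interface a transformer would code against. */
interface DocView {
  Object getFieldValue(String name);
  void setField(String name, Object value);
}

/** Hypothetical transformer contract that depends only on the interface. */
interface ViewTransformer {
  void transform(DocView doc);
}

/** One possible implementation; a Lucene-backed one could be swapped in. */
final class SimpleDocView implements DocView {
  private final Map<String, Object> fields = new LinkedHashMap<>();

  public Object getFieldValue(String name) {
    return fields.get(name);
  }

  public void setField(String name, Object value) {
    fields.put(name, value);
  }
}

/** Adds a shard field without knowing the concrete document class. */
final class ShardTransformer implements ViewTransformer {
  private final String shard;

  ShardTransformer(String shard) {
    this.shard = shard;
  }

  public void transform(DocView doc) {
    doc.setField("[shard]", shard);
  }
}
```

Because {{ShardTransformer}} never names a concrete class, the same transformer would run unchanged against either backing implementation.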

bq. When you say "write it out" do you mean directly generating 
JSON/XML/javabin? I think javabin would require creating a SolrDocument. If not, 
the client side changes too.

No, the output format remains the same. There will be zero changes. So even an 
older client should have no problem communicating with a new Solr. 

bq. but it seems that javabin is shipping a serialized SolrDocument...

Javabin is a serialization/deserialization format. It is entirely possible to 
construct that format without creating an Object.
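
The general point can be sketched with a toy wire format. This is NOT the javabin encoding; {{FlatFieldCodec}} and its layout (a count followed by name/value strings) are assumptions made purely for illustration. The writer emits bytes straight from a flat field list, and the reader cannot tell whether a document object ever existed on the sending side:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Toy codec (hypothetical format): int count, then UTF name/value pairs. */
final class FlatFieldCodec {

  /** Writes {name, value} pairs directly; no intermediate document object. */
  static byte[] write(List<String[]> storedFields) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      out.writeInt(storedFields.size());
      for (String[] field : storedFields) {
        out.writeUTF(field[0]); // field name
        out.writeUTF(field[1]); // field value
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }

  /** Client side: decodes the same bytes regardless of how they were produced. */
  static List<String[]> read(byte[] data) {
    List<String[]> fields = new ArrayList<>();
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
      int count = in.readInt();
      for (int i = 0; i < count; i++) {
        String name = in.readUTF();
        String value = in.readUTF();
        fields.add(new String[] {name, value});
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return fields;
  }
}
```

As long as the bytes on the wire are identical, the reading side is unaffected by how the writing side holds the document in memory.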





[jira] [Commented] (SOLR-13312) write out responses without creating SolrDocument objects

2019-03-19 Thread Gus Heck (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16796454#comment-16796454
 ] 

Gus Heck commented on SOLR-13312:
-

By Transformer, you mean DocumentTransformer (e.g. fl=id,name,score,[shard] to 
add the shard info)? So most vanilla query use cases don't use those, including 
those underpinning streaming expression searches? When you say "write it out" 
do you mean directly generating JSON/XML/javabin? I think javabin would require 
creating a SolrDocument. If not, the client side changes too... (though perhaps 
shipping the SolrDocument creation load to the SolrJ client will be of benefit)

I'm not an expert on the codec, having never had cause to work with it directly, 
but it seems that javabin is shipping a serialized SolrDocument 
(org.apache.solr.common.util.JavaBinCodec#readSolrDocument). If the binary wire 
format changes, you are probably proposing javabin2 (or lucenebin?). In that 
case it becomes slightly confusing, since adding ,[shard] could require the wt 
param to change. 

Perhaps I'm entirely misinterpreting what you said. A patch would probably 
clarify. I'm not for or against this yet, but the description seems short.




[jira] [Commented] (SOLR-13312) write out responses without creating SolrDocument objects

2019-03-16 Thread Noble Paul (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16794346#comment-16794346
 ] 

Noble Paul commented on SOLR-13312:
---

Tradeoffs?
If you have a transformer, it won't use this mode. If not, it should write out 
directly. There will be no difference for anything else. 
We are not giving up anything. The changes will not affect any other part of 
the system. What I mean to say is, the changes are not cross cutting. 

The performance delta will be measured after a PoC is written. We will see how 
much of an improvement we get and go further from there.




[jira] [Commented] (SOLR-13312) write out responses without creating SolrDocument objects

2019-03-16 Thread Gus Heck (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16794311#comment-16794311
 ] 

Gus Heck commented on SOLR-13312:
-

Improving memory efficiency seems nice, but like all optimization there are 
likely to be trade offs. What use cases will you be evaluating for memory, CPU, 
and overall response time? If we can win on all fronts with a variety of use 
cases (without making dev too difficult), great, but if we're giving up 
something some of the time, we need to know. This sounds like a pretty cross 
cutting change that could affect many things.




[jira] [Commented] (SOLR-13312) write out responses without creating SolrDocument objects

2019-03-10 Thread Noble Paul (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16789077#comment-16789077
 ] 

Noble Paul commented on SOLR-13312:
---

[~dsmiley] SolrDocument is an expensive data structure. Yes, we may need a more 
efficient data structure to actually accomplish this. HashMaps are extremely 
memory inefficient.
Skipping transformation is something we can't do now without breaking backward 
compatibility. We can probably rewrite Transformers like ChildDocTransformer 
to adapt to the new format. We may need to create the SolrDocument objects 
where other transformers are used. But most requests never use any 
transformers. They are paying a huge price.





[jira] [Commented] (SOLR-13312) write out responses without creating SolrDocument objects

2019-03-10 Thread David Smiley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16789008#comment-16789008
 ] 

David Smiley commented on SOLR-13312:
-

If I'm not mistaken, I believe a major purpose of SolrDocument is to map a 
field to a list of values.  Lucene's Document has it all flat and in no 
particular order.  Wouldn't this be more difficult to work with?  I'm also 
skeptical of how much value there is in skipping the transformation.
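
For concreteness, the flat-to-grouped step in question can be sketched like this (JDK only; {{FieldGrouper}} is a hypothetical name, and {name, value} string pairs stand in for Lucene's stored fields):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch of the work done when a flat field list becomes field -> values. */
final class FieldGrouper {

  /** Groups flat {name, value} pairs by name, keeping first-seen field order. */
  static Map<String, List<String>> group(List<String[]> flatFields) {
    Map<String, List<String>> grouped = new LinkedHashMap<>();
    for (String[] field : flatFields) {
      // Values of a multi-valued field may be interleaved with other fields.
      grouped.computeIfAbsent(field[0], k -> new ArrayList<>()).add(field[1]);
    }
    return grouped;
  }
}
```

With input {{tag=a, id=1, tag=b}} the two {{tag}} values end up in one list even though they were not adjacent, which is the convenience a purely flat representation would not provide on its own.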
