Hi, please find below the list of known shortcomings in the ES driver. Some of
them, like the one you mentioned in the previous email, are mainly due to the
way ES operates and returns results.

I'll add these to the documentation soon.

-> Scrolling responses for aggregate queries (by design, ES always returns
the complete bucket in a single JSON response; there is no scroll facility)
-> Order by in aggregate queries. Fully functional order by queries can get
complex, as ordering by a measure can happen only in the immediate parent
group by. 'Limit' is also blocked, as it could be misleading to have a limit
without an order by. (Please note that order by and limit still work in
queries without group bys.)
-> *, count is not available as of now.
-> Support for other UDFs. Right now common UDAFs like sum, min and max are
supported. We need a way to seamlessly translate a new UDF to Elasticsearch
without a code change.
-> Query estimation
-> Session-level config injection for properties like fetch size and group
by cardinality size? (Right now these configs are at the driver level.)
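
To illustrate the order by limitation above, here is a minimal Python sketch
of how a group by with a sum could map onto nested ES terms aggregations. It
is not the driver's actual translation code: the function name
build_group_by_aggs and the by_<col> / sum_<col> aggregation names are made up
for illustration, and the bucket size of 6 simply mirrors the
lens.driver.es.aggr.bucket.size value quoted later in this thread. The point
it shows is that the sum metric sits under the innermost terms bucket, so only
that bucket (the immediate parent group by) can be ordered by it.

```python
import json


def build_group_by_aggs(group_cols, measure_col, bucket_size=6, descending=True):
    """Build a nested terms aggregation for GROUP BY group_cols with SUM(measure_col).

    Hypothetical helper for illustration only; not part of Lens.
    """
    metric_name = f"sum_{measure_col}"
    # Start from the metric; each group by column wraps it in one more terms level.
    node = {metric_name: {"sum": {"field": measure_col}}}
    for depth, col in enumerate(reversed(group_cols)):
        terms = {"field": col, "size": bucket_size}
        if depth == 0:
            # Only the innermost terms bucket is the direct parent of the metric,
            # so it is the only level that can be ordered by it here.
            terms["order"] = {metric_name: "desc" if descending else "asc"}
        node = {f"by_{col}": {"terms": terms, "aggs": node}}
    # size 0: return only the aggregation buckets, no document hits.
    return {"size": 0, "aggs": node}


body = build_group_by_aggs(["col_1", "col_2"], "col_3")
print(json.dumps(body, indent=2))
```

In the printed request, by_col_2 carries the order clause while by_col_1 does
not, which is why a fully general order by across all group by levels needs
extra work on the driver side.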

Thanks,
Amruth S
(09486075517)

On Wed, Jul 6, 2016 at 11:29 AM, Gary Wu <[email protected]> wrote:

> Hi,
> I also find that "*select col_2, sum(col_2)  from  center.test limit 100*" is
> not supported either.
>>
>> lens-shell>select col_2, sum(col_2)  from  center.test limit 100
>>
>> 06 七月 2016 05:56:43 [Spring Shell] INFO  cliLogger - Query handle:
>>> 54c9b8a6-816e-492d-b1a1-ba73511653ea
>>
>> 06 七月 2016 05:56:43 [Spring Shell] INFO  cliLogger - User query: 'select
>>> col_2, sum(col_2) from center.test limit 100' was submitted to es/es1
>>
>> 06 七月 2016 05:56:43 [Spring Shell] INFO  cliLogger -  Driver query:
>>> 'select col_2, sum(col_2) from center.test limit 100' and Driver handle:
>>> null
>>
>> Command failed java.lang.IllegalStateException: Failed to get resultset
>>> metadata, cause:HTTP 500 Internal Server Error
>>
>> lens-shell>
>
> Thanks
>
> On Wed, Jul 6, 2016 at 1:52 PM, Gary Wu <[email protected]> wrote:
>
>> Hi Amruth,
>> Thank you for your help.
>> Yes, "*select col_2  from  center.test limit 100*" returns the
>> right result, but "*select col_2  from  center.test*" returns nothing.
>> I also checked the ES driver configuration file and found that
>> "*lens.driver.es.max.row.size*" is set to -1.
>> P.S. My Elasticsearch version is 2.3.2.
>> My Lens version is 2.5.0-beta.
>>
>> The Lens server configuration file:
>>
>>>   <property>
>>>
>>>     <name>lens.driver.es.term.fetch.size</name>
>>>
>>>     <description>Fetch (buffer) size for document look up
>>>> queries</description>
>>>
>>>     <value>10000</value>
>>>
>>>   </property>
>>>
>>>   <property>
>>>
>>>     <name>lens.driver.es.query.timeout.millis</name>
>>>
>>>     <description>Query timeout</description>
>>>
>>>     <value>10000</value>
>>>
>>>   </property>
>>>
>>>   <property>
>>>
>>> *    <name>lens.driver.es.max.row.size</name>*
>>>
>>> *    <description>max rows for es document look up queries, non existent
>>>> or -1 refers no limit</description>*
>>>
>>> *    <value>-1</value>*
>>>
>>>   </property>
>>>
>>>   <property>
>>>
>>>     <name>lens.driver.es.aggr.bucket.size</name>
>>>
>>>     <description>Max cardinality of group by (higher value means higher
>>>> resource usage at server end)</description>
>>>
>>>     <value>6</value>
>>>
>>>   </property>
>>>
>>>   <property>
>>>
>>>     <name>lens.driver.es.jest.servers</name>
>>>
>>>     <description>List of http servers, will be used on a round robin
>>>> basis</description>
>>>
>>>     <value>http://10.10.44.21:19200</value>
>>>
>>>   </property>
>>>
>>>   <property>
>>>
>>>     <name>lens.driver.es.jest.max.conn</name>
>>>
>>>     <description>max connections</description>
>>>
>>>     <value>20</value>
>>>
>>>   </property>
>>>
>>>   <property>
>>>
>>>     <name>lens.driver.es.client.class</name>
>>>
>>>     <description>Choice of client class, default is
>>>> JestClientImpl</description>
>>>
>>>     <value>org.apache.lens.driver.es.client.jest.JestClientImpl</value>
>>>
>>>   </property>
>>>
>>> </configuration>
>>>
>>>
>> The command results are as follows:
>>
>>> lens-shell>*select col_2  from  center.test limit 100*
>>>
>>> 06 七月 2016 05:39:28 [Spring Shell] INFO  cliLogger - Query handle:
>>>> 1c602a46-118c-45fb-be93-ad62717f5d45
>>>
>>> 06 七月 2016 05:39:28 [Spring Shell] INFO  cliLogger - User query: 'select
>>>> col_2 from center.test limit 100' was submitted to es/es1
>>>
>>> 06 七月 2016 05:39:28 [Spring Shell] INFO  cliLogger -  Driver query:
>>>> 'select col_2 from center.test limit 100' and Driver handle: null
>>>
>>> 06 七月 2016 05:39:28 [Spring Shell] INFO  cliLogger - Query Status:
>>>> Status : LAUNCHED
>>>
>>>  Progress : 0.0
>>>
>>>
>>>
>>> 06 七月 2016 05:39:28 [Spring Shell] INFO  cliLogger - Query Status:
>>>> Status : SUCCESSFUL
>>>
>>>  Message : Query is successful!
>>>
>>>  Progress : 1.0
>>>
>>>  Result Available
>>>
>>> col_2
>>>
>>> *Result available in memory, attaching here: *
>>>
>>>
>>>> *5411.0    *
>>>
>>> *5433.0    *
>>>
>>> *5678.0    *
>>>
>>> *3 rows processed in (0) seconds.*
>>>
>>>
>>
>> lens-shell>select col_2  from  center.test
>>
>> 06 七月 2016 05:41:38 [Spring Shell] INFO  cliLogger - Query handle:
>>> df5be094-69a2-4cc7-b8d3-bd6245e63472
>>
>> 06 七月 2016 05:41:38 [Spring Shell] INFO  cliLogger - User query: 'select
>>> col_2 from center.test' was submitted to es/es1
>>
>> 06 七月 2016 05:41:38 [Spring Shell] INFO  cliLogger -  Driver query:
>>> 'select col_2 from center.test' and Driver handle: null
>>
>> *col_2    *
>>
>> *Result available in memory, attaching here: *
>>
>>
>>> *0 rows processed in (0) seconds.*
>>
>>
>>> *lens-shell>*
>>
>>
>> Thanks
>> Gary
>>
>> On Wed, Jul 6, 2016 at 1:00 PM, Amruth Sampath <[email protected]>
>> wrote:
>>
>>> Hi Gary, yes, * is also not supported just yet in the Elasticsearch
>>> driver.
>>>
>>> I can see that you are getting results successfully for the
>>> aggregation query:
>>>
>>> *col_2    *
>>>
>>> *Result available in memory, attaching here: *
>>>
>>>
>>>> *16522.0 *
>>>
>>>
>>> I am not sure what the problem is with this case, though: "*select col_2
>>>  from  center.test*". It should have given you results limited to
>>> lens.driver.es.max.row.size.
>>>
>>> I'll try to debug this in the evening. Can you send the versions of
>>> Lens and Elasticsearch you are using? I'll try to reproduce it.
>>>
>>> Meanwhile, can you try adding a limit clause and check whether you get
>>> the result?
>>> *select col_2  from  center.test limit 100*
>>>
>>> Thanks,
>>> Amruth S
>>> (09486075517)
>>>
>>> On Wed, Jul 6, 2016 at 10:16 AM, Gary Wu <[email protected]>
>>> wrote:
>>>
>>>> Looping in everyone.
>>>>
>>>> Hi Amruth and Amareshwarisr,
>>>> Thank you for your replies. Lens is great software, and the new querying
>>>> engine is very cool.  :)
>>>> Following your instructions, I ran the client CLI and entered some
>>>> commands in the console.
>>>> I found some confusing test results:
>>>> 1) The aggregation operations (sum/count/max) return a result,
>>>> but a plain select *returns nothing*. I also found that * is not supported.
>>>> Did I miss anything important?
>>>>
>>>> lens-shell>*select sum(col_2)  from  center.test*
>>>>>
>>>>> 06 七月 2016 03:56:42 [Spring Shell] INFO  cliLogger - Query handle:
>>>>>> 55d39cb3-1e34-419c-956e-27d537fcba66
>>>>>
>>>>> 06 七月 2016 03:56:42 [Spring Shell] INFO  cliLogger - User query:
>>>>>> 'select sum(col_2) from center.test' was submitted to es/es1
>>>>>
>>>>> 06 七月 2016 03:56:42 [Spring Shell] INFO  cliLogger -  Driver query:
>>>>>> 'select sum(col_2) from center.test' and Driver handle: null
>>>>>
>>>>> 06 七月 2016 03:56:42 [Spring Shell] INFO  cliLogger - Query Status:
>>>>>> Status : LAUNCHED
>>>>>
>>>>>  Progress : 0.0
>>>>>
>>>>>
>>>>>
>>>>> 06 七月 2016 03:56:42 [Spring Shell] INFO  cliLogger - Query Status:
>>>>>> Status : SUCCESSFUL
>>>>>
>>>>>  Message : Query is successful!
>>>>>
>>>>>  Progress : 1.0
>>>>>
>>>>>  Result Available
>>>>>
>>>>> *col_2    *
>>>>>
>>>>> *Result available in memory, attaching here: *
>>>>>
>>>>>
>>>>>> *16522.0    *
>>>>>
>>>>> *1 rows processed in (0) seconds*
>>>>>
>>>>>
>>>>
>>>> lens-shell>*select col_2  from  center.test*
>>>>
>>>> 06 七月 2016 04:02:22 [Spring Shell] INFO  cliLogger - Query handle:
>>>>> 00c29269-67d0-4fcb-af69-0b3152a5e0ba
>>>>
>>>> 06 七月 2016 04:02:22 [Spring Shell] INFO  cliLogger - User query:
>>>>> 'select col_2 from center.test' was submitted to es/es1
>>>>
>>>> 06 七月 2016 04:02:22 [Spring Shell] INFO  cliLogger -  Driver query:
>>>>> 'select col_2 from center.test' and Driver handle: null
>>>>
>>>> 06 七月 2016 04:02:22 [Spring Shell] INFO  cliLogger - Query Status:
>>>>> Status : SUCCESSFUL
>>>>
>>>>  Message : Query is successful!
>>>>
>>>>  Progress : 1.0
>>>>
>>>>  Result Available
>>>>
>>>> *col_2    *
>>>>
>>>> *Result available in memory, attaching here: *
>>>>
>>>>
>>>>> *0 rows processed in (0) seconds.*
>>>>
>>>>
>>>> 2) All the commands were submitted to the new ES driver (as shown in the
>>>> log) automatically, though I didn't specify where the table is.
>>>> I have several data sources (Hive, JDBC, ES, etc.), so I guess
>>>> Lens tries all available data sources (or stores the metadata),
>>>> finds that ES has the table, and then submits the request to ES. Is that
>>>> right?
>>>>
>>>> 06 七月 2016 03:56:42 [Spring Shell] INFO  cliLogger - User query:
>>>>> 'select sum(col_2) from center.test' was submitted to es/es1
>>>>
>>>>
>>>> Thanks
>>>>
>>>> On Wed, Jul 6, 2016 at 12:07 PM, Amruth Sampath <[email protected]>
>>>> wrote:
>>>>
>>>>> Gary,
>>>>>
>>>>> The general format is
>>>>>
>>>>> *select <col1>, <col2> ... from <index.type>;*
>>>>>
>>>>> Joins are not supported. Basic aggregations and group bys are supported.
>>>>>
>>>>> In your case, probably:
>>>>>
>>>>> *select col1, col2 from plant.flower;*
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Amruth S
>>>>> (09486075517)
>>>>>
>>>>> On Wed, Jul 6, 2016 at 8:18 AM, amareshwarisr . <[email protected]
>>>>> > wrote:
>>>>>
>>>>>> Gary,
>>>>>>
>>>>>> I think you should be able to query the ES index simply with SQL; there
>>>>>> is no need to create any dimtable or fact table in Lens, unless they are
>>>>>> part of a cube you are trying.
>>>>>>
>>>>>> Amruth, can you help Gary with running queries on Elasticsearch?
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>>
>>>>>> On Tue, Jul 5, 2016 at 5:54 PM, Gary Wu <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Team,
>>>>>>> I am trying to use *elasticsearch* as the data source for Lens. I
>>>>>>> did the following:
>>>>>>> 1) I made a new directory in the drivers dir and added an XML file.
>>>>>>>
>>>>>>>> ..../server/conf/drivers/
>>>>>>>>
>>>>>>>>
>>>>>>>> ├── es
>>>>>>>> │   └── es1
>>>>>>>> │       └── esdriver-site.xml
>>>>>>>> ├── hive
>>>>>>>> │   └── hive1
>>>>>>>> │       └── hivedriver-site.xml
>>>>>>>> └── jdbc
>>>>>>>>     └── jdbc1
>>>>>>>>         └── jdbcdriver-site.xml
>>>>>>>
>>>>>>>
>>>>>>> <?xml version="1.0"?>
>>>>>>>>>
>>>>>>>> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>>>>>>>
>>>>>>> <configuration>
>>>>>>>>
>>>>>>>>   <property>
>>>>>>>>
>>>>>>>>     <name>lens.driver.es.term.fetch.size</name>
>>>>>>>>
>>>>>>>>     <description>Fetch (buffer) size for document look up
>>>>>>>>> queries</description>
>>>>>>>>
>>>>>>>>     <value>10000</value>
>>>>>>>>
>>>>>>>>   </property>
>>>>>>>>
>>>>>>>>   <property>
>>>>>>>>
>>>>>>>>     <name>lens.driver.es.query.timeout.millis</name>
>>>>>>>>
>>>>>>>>     <description>Query timeout</description>
>>>>>>>>
>>>>>>>>     <value>10000</value>
>>>>>>>>
>>>>>>>>   </property>
>>>>>>>>
>>>>>>>>   <property>
>>>>>>>>
>>>>>>>>     <name>lens.driver.es.max.row.size</name>
>>>>>>>>
>>>>>>>>     <description>max rows for es document look up queries, non
>>>>>>>>> existent or -1 refers no limit</description>
>>>>>>>>
>>>>>>>>     <value>-1</value>
>>>>>>>>
>>>>>>>>   </property>
>>>>>>>>
>>>>>>>>   <property>
>>>>>>>>
>>>>>>>>     <name>lens.driver.es.aggr.bucket.size</name>
>>>>>>>>
>>>>>>>>     <description>Max cardinality of group by (higher value means
>>>>>>>>> higher resource usage at server end)</description>
>>>>>>>>
>>>>>>>>     <value>6</value>
>>>>>>>>
>>>>>>>>   </property>
>>>>>>>>
>>>>>>>>   <property>
>>>>>>>>
>>>>>>>>     <name>lens.driver.es.jest.servers</name>
>>>>>>>>
>>>>>>>>     <description>List of http servers, will be used on a round
>>>>>>>>> robin basis</description>
>>>>>>>>
>>>>>>>>     <value>http://10.10.44.21:19200</value>
>>>>>>>>
>>>>>>>>   </property>
>>>>>>>>
>>>>>>>>   <property>
>>>>>>>>
>>>>>>>>     <name>lens.driver.es.jest.max.conn</name>
>>>>>>>>
>>>>>>>>     <description>max connections</description>
>>>>>>>>
>>>>>>>>     <value>20</value>
>>>>>>>>
>>>>>>>>   </property>
>>>>>>>>
>>>>>>>>   <property>
>>>>>>>>
>>>>>>>>     <name>lens.driver.es.client.class</name>
>>>>>>>>
>>>>>>>>     <description>Choice of client class, default is
>>>>>>>>> JestClientImpl</description>
>>>>>>>>
>>>>>>>>
>>>>>>>>> <value>org.apache.lens.driver.es.client.jest.JestClientImpl</value>
>>>>>>>>
>>>>>>>>   </property>
>>>>>>>>
>>>>>>>> </configuration>
>>>>>>>>
>>>>>>>>
>>>>>>> 2) Then I edited the drivers property in lens-site.xml:
>>>>>>>
>>>>>>>> <property>
>>>>>>>>
>>>>>>>>   <name>lens.server.drivers</name>
>>>>>>>>
>>>>>>>>
>>>>>>>>>  
>>>>>>>>> <value>hive:org.apache.lens.driver.hive.HiveDriver,jdbc:org.apache.lens.driver.jdbc.JDBCDriver,
>>>>>>>>> *es:org.apache.lens.driver.es.ESDriver*</value>
>>>>>>>>
>>>>>>>> </property>
>>>>>>>>
>>>>>>>>
>>>>>>> In my Elasticsearch (http://10.10.44.21:19200), there are *already*
>>>>>>> some docs, named
>>>>>>> index/type/index  ....
>>>>>>> plant/flower/1 ....
>>>>>>> plant/flower/2 ....
>>>>>>> plant/flower/3 ....
>>>>>>>
>>>>>>> *What should I do next to query the data from ES?* Add a dim or
>>>>>>> dimtable? I did not find examples for Elasticsearch in the Lens
>>>>>>> documentation. Could you give me some instructions or examples for that?
>>>>>>>
>>>>>>> Thank you very much.
>>>>>>> Gary
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
