Hi Tony,

You're correct: the global dictionary isn't supported in the streaming
builder (this is the first report of it). Could you please open a JIRA?
https://issues.apache.org/jira/secure/Dashboard.jspa

BTW, we're developing a new version of the streaming engine, which will
reuse most of the logic of the batch cubing engine and is planned to roll
out in v1.6. I believe there will be no such issue with the new design.
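
To make the cross-segment concern concrete, here is a minimal sketch in
Java. It is not Kylin code: DictionaryDemo and all of its method names are
made up for illustration, and java.util.BitSet stands in for the bitmap
counter. It assumes each segment assigns integer ids from its own
dictionary, so the same id can refer to different users in different
segments, and merging the bitmaps then undercounts:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

public class DictionaryDemo {

    // Hypothetical dictionary: maps each distinct value to a dense int id (0, 1, 2, ...).
    static Map<String, Integer> buildDictionary(Collection<String> values) {
        Map<String, Integer> dict = new HashMap<>();
        for (String value : new TreeSet<>(values)) {
            dict.put(value, dict.size());
        }
        return dict;
    }

    // Encodes one segment's user ids as a bitmap of dictionary ids.
    static BitSet encode(List<String> segment, Map<String, Integer> dict) {
        BitSet bitmap = new BitSet();
        for (String value : segment) {
            bitmap.set(dict.get(value));
        }
        return bitmap;
    }

    // Each segment uses its own dictionary, so ids collide across segments.
    static int countWithPerSegmentDictionaries(List<List<String>> segments) {
        BitSet merged = new BitSet();
        for (List<String> segment : segments) {
            merged.or(encode(segment, buildDictionary(segment)));
        }
        return merged.cardinality();
    }

    // All segments share one dictionary, so an id always means the same user.
    static int countWithGlobalDictionary(List<List<String>> segments) {
        List<String> all = new ArrayList<>();
        for (List<String> segment : segments) {
            all.addAll(segment);
        }
        Map<String, Integer> global = buildDictionary(all);
        BitSet merged = new BitSet();
        for (List<String> segment : segments) {
            merged.or(encode(segment, global));
        }
        return merged.cardinality();
    }

    public static void main(String[] args) {
        List<List<String>> segments = Arrays.asList(
                Arrays.asList("alice", "bob"),
                Arrays.asList("carol", "dave"));
        System.out.println("per-segment dictionaries: " + countWithPerSegmentDictionaries(segments));
        System.out.println("global dictionary:        " + countWithGlobalDictionary(segments));
    }
}
```

With the sample data in main, the per-segment variant reports 2 distinct
users and the shared-dictionary variant reports 4.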

2016-09-26 14:56 GMT+08:00 Tony Lee <btony...@gmail.com>:

> Thanks
>
> But this does not work on streaming cube.
>
> I read some code and found that in class *StreamingCubeBuilder,* the
> dictionary map was built by *DictionaryGenerator.buildDictionary()*
> instead of *DictionaryManager.buildDictionary()*. Does this mean that
> streaming cube does not support global dictionary?
>
> I added USERID to the dimensions, and then the cube built successfully.
> But I think the result will be incorrect if I calculate count distinct
> across different segments. Is that right?
>
>
> Tony
>
> On Sat, Sep 24, 2016 at 10:29 PM, ShaoFeng Shi <shaofeng...@apache.org>
> wrote:
>
>> Hi Tony,
>>
>> The error occurred when building a bitmap counter (for distinct
>> count); from your cube descriptor, it seems no global dictionary is
>> specified for the user id column. Please check this blog:
>> https://kylin.apache.org/blog/2016/08/01/count-distinct-in-kylin/
>>
>> 2016-09-22 10:49 GMT+08:00 Tony Lee <btony...@gmail.com>:
>>
>>> Thanks, ShaoFeng Shi. That is the reason.
>>>
>>> But unfortunately, I have a new problem with precise count distinct.
>>>
>>> I added a streaming table on version 1.5.4 with my own JSON, which
>>> looks like this:
>>> {
>>>     "logTimestamp":1474456891127,
>>>     "datetime":"2016-09-21 19:21:31",
>>>     "uploadTime":"20160921192023",
>>>     "userId":"f2d28cbf9e21340a49e97063486db1f5",
>>>     "accountId":"84108490",
>>>     "otherfield":"...."
>>> }
>>>
>>> *The error message while building the cube is*
>>>
>>> 2016-09-22 10:01:40,731 ERROR [main StreamingCLI:103]: error start streaming
>>> java.lang.RuntimeException: error build cube from StreamingBatch
>>>         at org.apache.kylin.engine.streaming.cube.StreamingCubeBuilder.build(StreamingCubeBuilder.java:105)
>>>         at org.apache.kylin.engine.streaming.OneOffStreamingBuilder$1.run(OneOffStreamingBuilder.java:79)
>>>         at org.apache.kylin.engine.streaming.cli.StreamingCLI.startOneOffCubeStreaming(StreamingCLI.java:123)
>>>         at org.apache.kylin.engine.streaming.cli.StreamingCLI.main(StreamingCLI.java:97)
>>> Caused by: java.lang.NullPointerException
>>>         at org.apache.kylin.measure.bitmap.BitmapMeasureType$1.valueOf(BitmapMeasureType.java:100)
>>>         at org.apache.kylin.measure.bitmap.BitmapMeasureType$1.valueOf(BitmapMeasureType.java:89)
>>>         at org.apache.kylin.cube.inmemcubing.InMemCubeBuilderInputConverter.buildValueOf(InMemCubeBuilderInputConverter.java:122)
>>>         at org.apache.kylin.cube.inmemcubing.InMemCubeBuilderInputConverter.buildValue(InMemCubeBuilderInputConverter.java:94)
>>>         at org.apache.kylin.cube.inmemcubing.InMemCubeBuilderInputConverter.convert(InMemCubeBuilderInputConverter.java:70)
>>>         at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder$InputConverter$1.next(InMemCubeBuilder.java:542)
>>>         at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder$InputConverter$1.next(InMemCubeBuilder.java:523)
>>>         at org.apache.kylin.gridtable.GTAggregateScanner.iterator(GTAggregateScanner.java:139)
>>>         at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder.createBaseCuboid(InMemCubeBuilder.java:339)
>>>         at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder.build(InMemCubeBuilder.java:166)
>>>         at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder.build(InMemCubeBuilder.java:135)
>>>         at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder.build(InMemCubeBuilder.java:122)
>>>         at org.apache.kylin.cube.inmemcubing.AbstractInMemCubeBuilder$1.run(AbstractInMemCubeBuilder.java:80)
>>>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>         at java.lang.Thread.run(Thread.java:745)
>>>
>>>
>>> *and the cube json is*
>>> {
>>>   "uuid": "db91bcea-b33f-48af-a2f5-6014b14031f4",
>>>   "last_modified": 1474511879506,
>>>   "version": "1.5.4",
>>>   "name": "hot_play_c",
>>>   "model_name": "hot_play_cube",
>>>   "description": "",
>>>   "null_string": null,
>>>   "dimensions": [
>>>     {
>>>       "name": "DEFAULT.HOT_PLAY.HOUR_START",
>>>       "table": "DEFAULT.HOT_PLAY",
>>>       "column": "HOUR_START",
>>>       "derived": null
>>>     },
>>>     {
>>>       "name": "DEFAULT.HOT_PLAY.MINUTE_START",
>>>       "table": "DEFAULT.HOT_PLAY",
>>>       "column": "MINUTE_START",
>>>       "derived": null
>>>     }
>>>   ],
>>>   "measures": [
>>>     {
>>>       "name": "_COUNT_",
>>>       "function": {
>>>         "expression": "COUNT",
>>>         "parameter": {
>>>           "type": "constant",
>>>           "value": "1",
>>>           "next_parameter": null
>>>         },
>>>         "returntype": "bigint"
>>>       },
>>>       "dependent_measure_ref": null
>>>     },
>>>     {
>>>       "name": "COUNT_DISTINCT_USER",
>>>       "function": {
>>>         "expression": "COUNT_DISTINCT",
>>>         "parameter": {
>>>           "type": "column",
>>>           "value": "USERID",
>>>           "next_parameter": null
>>>         },
>>>         "returntype": "bitmap"
>>>       },
>>>       "dependent_measure_ref": null
>>>     }
>>>   ],
>>>   "dictionaries": [],
>>>   "rowkey": {
>>>     "rowkey_columns": [
>>>       {
>>>         "column": "HOUR_START",
>>>         "encoding": "time",
>>>         "isShardBy": false
>>>       },
>>>       {
>>>         "column": "MINUTE_START",
>>>         "encoding": "time",
>>>         "isShardBy": false
>>>       }
>>>     ]
>>>   },
>>>   "hbase_mapping": {
>>>     "column_family": [
>>>       {
>>>         "name": "F1",
>>>         "columns": [
>>>           {
>>>             "qualifier": "M",
>>>             "measure_refs": [
>>>               "_COUNT_"
>>>             ]
>>>           }
>>>         ]
>>>       },
>>>       {
>>>         "name": "F2",
>>>         "columns": [
>>>           {
>>>             "qualifier": "M",
>>>             "measure_refs": [
>>>               "COUNT_DISTINCT_USER"
>>>             ]
>>>           }
>>>         ]
>>>       }
>>>     ]
>>>   },
>>>   "aggregation_groups": [
>>>     {
>>>       "includes": [
>>>         "HOUR_START",
>>>         "MINUTE_START"
>>>       ],
>>>       "select_rule": {
>>>         "hierarchy_dims": [],
>>>         "mandatory_dims": [],
>>>         "joint_dims": []
>>>       }
>>>     }
>>>   ],
>>>   "signature": "QXddyWCVVCYQcozxd4Zh2w==",
>>>   "notify_list": [],
>>>   "status_need_notify": [
>>>     "ERROR",
>>>     "DISCARDED",
>>>     "SUCCEED"
>>>   ],
>>>   "partition_date_start": 0,
>>>   "partition_date_end": 3153600000000,
>>>   "auto_merge_time_ranges": [
>>>     604800000,
>>>     2419200000
>>>   ],
>>>   "retention_range": 0,
>>>   "engine_type": 2,
>>>   "storage_type": 2,
>>>   "override_kylin_properties": {}
>>> }
>>>
>>> *No error occurs after I change the returntype to hllc(16).*
>>>
>>> *I have struggled with this for several days. Any hints?*
>>>
>>> On Wed, Sep 21, 2016 at 10:47 PM, ShaoFeng Shi <shaofeng...@apache.org>
>>> wrote:
>>>
>>>> Hi Tony,
>>>>
>>>> It seems your cube isn't partitioned (no partition date column
>>>> specified); please check or provide the cube JSON.
>>>>
>>>> 2016-09-21 0:30 GMT+08:00 Alberto Ramón <a.ramonporto...@gmail.com>:
>>>>
>>>>> I don't know, but can you check this change? KYLIN-1744
>>>>> <https://issues.apache.org/jira/browse/KYLIN-1744> in V1.3
>>>>>
>>>>>
>>>>> 2016-09-20 14:50 GMT+02:00 Tony Lee <btony...@gmail.com>:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I was building a cube from a stream as the document
>>>>>> (http://kylin.apache.org/docs15/tutorial/cube_streaming.html) says.
>>>>>>
>>>>>> I was using 1.5.3 and encountered this error. The same error occurs
>>>>>> on 1.5.4. Everything works fine on 1.5.2.1.
>>>>>>
>>>>>> Any idea how to solve this?
>>>>>>
>>>>>>
>>>>>> 2016-09-20 20:31:51,520 INFO  [main KafkaStreamingInput:129]: finish to get streaming batch, total message count:30
>>>>>> 2016-09-20 20:31:51,532 DEBUG [main CubeManager:855]: Reloaded new cube: STREAMING_CUBE with reference beingCUBE[name=STREAMING_CUBE] having 1 segments:KYLIN_2822I1W3CX
>>>>>> 2016-09-20 20:31:51,536 INFO  [main CubeManager:314]: Updating cube instance 'STREAMING_CUBE'
>>>>>> 2016-09-20 20:31:51,538 WARN  [main StreamingCLI:127]: invalid args:streaming start STREAMING_CUBE 1474374540000_1474374600000 -start 1474374540000 -end 1474374600000 -cube STREAMING_CUBE
>>>>>> 2016-09-20 20:31:51,539 ERROR [main StreamingCLI:103]: error start streaming
>>>>>> java.lang.IllegalStateException: Segments overlap: STREAMING_CUBE[FULL_BUILD] and STREAMING_CUBE[FULL_BUILD]
>>>>>> at org.apache.kylin.cube.CubeValidator.validate(CubeValidator.java:85)
>>>>>> at org.apache.kylin.cube.CubeManager.updateCubeWithRetry(CubeManager.java:358)
>>>>>> at org.apache.kylin.cube.CubeManager.updateCube(CubeManager.java:301)
>>>>>> at org.apache.kylin.cube.CubeManager.appendSegment(CubeManager.java:441)
>>>>>> at org.apache.kylin.engine.streaming.cube.StreamingCubeBuilder.createBuildable(StreamingCubeBuilder.java:118)
>>>>>> at org.apache.kylin.engine.streaming.OneOffStreamingBuilder$1.run(OneOffStreamingBuilder.java:76)
>>>>>> at org.apache.kylin.engine.streaming.cli.StreamingCLI.startOneOffCubeStreaming(StreamingCLI.java:123)
>>>>>> at org.apache.kylin.engine.streaming.cli.StreamingCLI.main(StreamingCLI.java:97)
>>>>>> 2016-09-20 20:31:51,543 INFO  [Thread-0 ConnectionManager$HConnectionImplementation:1678]: Closing zookeeper sessionid=0x35708fbc2740013
>>>>>> 2016-09-20 20:31:51,549 INFO  [Thread-0 ZooKeeper:684]: Session: 0x35708fbc2740013 closed
>>>>>> 2016-09-20 20:31:51,549 INFO  [main-EventThread ClientCnxn:512]: EventThread shut down
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Best regards,
>>>>
>>>> Shaofeng Shi 史少锋
>>>>
>>>>
>>>
>>
>>
>> --
>> Best regards,
>>
>> Shaofeng Shi 史少锋
>>
>>
>


-- 
Best regards,

Shaofeng Shi 史少锋
