[jira] [Commented] (KYLIN-3434) Support prepare statement in Kylin server side

2018-07-01 Thread Shaofeng SHI (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16529010#comment-16529010
 ] 

Shaofeng SHI commented on KYLIN-3434:
-

+1 Good idea

> Support prepare statement in Kylin server side
> --
>
> Key: KYLIN-3434
> URL: https://issues.apache.org/jira/browse/KYLIN-3434
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Ma Gang
>Assignee: Ma Gang
>Priority: Major
>
> Kylin uses Calcite as its SQL engine. When a SQL query reaches the Kylin 
> server, it must be parsed, optimized, and compiled (code generation) before 
> Kylin's cube storage is queried. These first three steps often take 50-150 ms 
> (depending on the complexity of the SQL). If the Kylin server caches the 
> prepared result, those three steps can be skipped.
> The idea is to cache Calcite's PreparedStatement object and the related 
> OLAPContexts on the server side; when a prepare request arrives with the same 
> SQL, the cached PreparedStatement is reused for execution. Because 
> PreparedStatement is not thread-safe, I plan to cache the instances in an 
> ObjectPool (using the Apache commons-pool library).
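
A minimal sketch of the pooling idea described above, not Kylin's actual implementation. Because a prepared statement is not thread-safe, each borrower gets an instance exclusively and returns it afterwards. The class PreparedPoolSketch, its methods borrow and giveBack, and the type parameter P (standing in for Calcite's prepared statement plus its OLAPContexts) are all hypothetical names; a plain JDK map of queues stands in for the commons-pool ObjectPool the issue mentions.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedDeque;
import java.util.function.Function;

/** Hypothetical keyed pool: one queue of idle prepared objects per SQL string. */
class PreparedPoolSketch<P> {
    private final ConcurrentHashMap<String, ConcurrentLinkedDeque<P>> idle =
            new ConcurrentHashMap<>();
    // Assumption: this function does the expensive parse/optimize/codegen work.
    private final Function<String, P> preparer;

    PreparedPoolSketch(Function<String, P> preparer) {
        this.preparer = preparer;
    }

    /** Take an idle instance for this SQL, or prepare a new one on a miss. */
    P borrow(String sql) {
        ConcurrentLinkedDeque<P> queue = idle.get(sql);
        P cached = (queue == null) ? null : queue.pollFirst();
        return (cached != null) ? cached : preparer.apply(sql);
    }

    /** Hand the instance back so a later request with the same SQL can reuse it. */
    void giveBack(String sql, P prepared) {
        idle.computeIfAbsent(sql, k -> new ConcurrentLinkedDeque<>()).addFirst(prepared);
    }
}
```

This sketch has no size bound or eviction. A real pool (commons-pool provides keyed pools for this purpose) would cap the idle instances per SQL and expire stale ones.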



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (KYLIN-3428) java.lang.OutOfMemoryError: Requested array size exceeds VM limit

2018-07-01 Thread Shaofeng SHI (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI reassigned KYLIN-3428:
---

Assignee: yangcao

> java.lang.OutOfMemoryError: Requested array size exceeds VM limit
> -
>
> Key: KYLIN-3428
> URL: https://issues.apache.org/jira/browse/KYLIN-3428
> Project: Kylin
>  Issue Type: Bug
>  Components: Job Engine
>Affects Versions: v2.1.0, v2.2.0, v2.3.0, v2.3.1, v2.4.0
> Environment: kylin v2.2.0   jdk7
>Reporter: yangcao
>Assignee: yangcao
>Priority: Critical
>  Labels: Build_Base_Cuboid, MAP, OOM
> Fix For: v2.4.1, v2.5.0
>
> Attachments: patch-v1.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> LOG:
> 2018-06-26 15:50:24,032 INFO [main] org.apache.kylin.dict.DictionaryManager: 
> DictionaryManager(1499050426) loading DictionaryInfo(loadDictObj:true) at 
> /dict/xxx.xxx/C7/036b7ca0-8733-4c0c-99f5-5122919fd3dd.dict 2018-06-26 
> 15:50:25,586 ERROR [main] org.apache.kylin.engine.mr.KylinMapper: 
> com.google.common.util.concurrent.ExecutionError: java.lang.OutOfMemoryError: 
> Requested array size exceeds VM limit at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2232) at 
> com.google.common.cache.LocalCache.get(LocalCache.java:3965) at 
> com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3969) at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4829) 
> at 
> org.apache.kylin.dict.DictionaryManager.getDictionaryInfo(DictionaryManager.java:118)
>  at org.apache.kylin.cube.CubeManager.getDictionary(CubeManager.java:271) at 
> org.apache.kylin.cube.CubeSegment.getDictionary(CubeSegment.java:320) at 
> org.apache.kylin.cube.kv.CubeDimEncMap.getDictionary(CubeDimEncMap.java:86) 
> at org.apache.kylin.cube.kv.CubeDimEncMap.get(CubeDimEncMap.java:65) at 
> org.apache.kylin.cube.kv.RowKeyColumnIO.getColumnLength(RowKeyColumnIO.java:43)
>  at org.apache.kylin.cube.kv.RowKeyEncoder.(RowKeyEncoder.java:59) at 
> org.apache.kylin.cube.kv.AbstractRowKeyEncoder.createInstance(AbstractRowKeyEncoder.java:48)
>  at 
> org.apache.kylin.engine.mr.common.BaseCuboidBuilder.(BaseCuboidBuilder.java:84)
>  at 
> org.apache.kylin.engine.mr.steps.BaseCuboidMapperBase.doSetup(BaseCuboidMapperBase.java:70)
>  at 
> org.apache.kylin.engine.mr.steps.HiveToBaseCuboidMapper.doSetup(HiveToBaseCuboidMapper.java:36)
>  at org.apache.kylin.engine.mr.KylinMapper.setup(KylinMapper.java:48) at 
> org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142) at 
> org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787) at 
> org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) at 
> org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) at 
> java.security.AccessController.doPrivileged(Native Method) at 
> javax.security.auth.Subject.doAs(Subject.java:415) at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1707)
>  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) Caused by: 
> java.lang.OutOfMemoryError: Requested array size exceeds VM limit at 
> java.util.Arrays.copyOf(Arrays.java:2271) at 
> java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113) at 
> java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93) 
> at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140) at 
> org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1793) at 
> org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1769) at 
> org.apache.commons.io.IOUtils.copy(IOUtils.java:1744) at 
> org.apache.kylin.common.persistence.FileResourceStore.getResourceImpl(FileResourceStore.java:123)
>  at 
> org.apache.kylin.common.persistence.ResourceStore.getResource(ResourceStore.java:154)
>  at org.apache.kylin.dict.DictionaryManager.load(DictionaryManager.java:418) 
> at org.apache.kylin.dict.DictionaryManager$1.load(DictionaryManager.java:101) 
> at org.apache.kylin.dict.DictionaryManager$1.load(DictionaryManager.java:98) 
> at 
> com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568)
>  at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350) 
> at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313)
>  at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228) at 
> com.google.common.cache.LocalCache.get(LocalCache.java:3965) at 
> com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3969) at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4829) 
> at 
> org.apache.kylin.dict.DictionaryManager.getDictionaryInfo(DictionaryManager.java:118)
>  at org.apache.kylin.cube.CubeManager.getDictionary(CubeManager.java:271) at 
> 

[jira] [Commented] (KYLIN-3431) Avoid FileInputStream/FileOutputStream

2018-07-01 Thread Shaofeng SHI (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16529071#comment-16529071
 ] 

Shaofeng SHI commented on KYLIN-3431:
-

Thanks Ted!

> Avoid FileInputStream/FileOutputStream
> --
>
> Key: KYLIN-3431
> URL: https://issues.apache.org/jira/browse/KYLIN-3431
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Ted Yu
>Priority: Major
>
> They rely on finalizers (before Java 11), which create unnecessary GC load. 
> The alternatives, {{Files.newInputStream}}, are as easy to use and don't have 
> this issue.
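
For illustration, a small sketch of the replacement suggested above, using only java.nio.file.Files calls from the JDK. NioStreamSketch and its methods are hypothetical names, and the temp-file round trip is only there to make the example runnable.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

class NioStreamSketch {
    /** Write with Files.newOutputStream instead of new FileOutputStream(file). */
    static void write(Path path, byte[] data) {
        try (OutputStream out = Files.newOutputStream(path)) {
            out.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Read with Files.newInputStream instead of new FileInputStream(file). */
    static byte[] read(Path path) {
        try (InputStream in = Files.newInputStream(path)) {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Write the bytes to a temp file, read them back, and clean up. */
    static byte[] roundTrip(byte[] data) {
        try {
            Path tmp = Files.createTempFile("nio-stream-sketch", ".bin");
            try {
                write(tmp, data);
                return read(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```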





[jira] [Updated] (KYLIN-3431) Avoid FileInputStream/FileOutputStream

2018-07-01 Thread Shaofeng SHI (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-3431:

Fix Version/s: v2.5.0

> Avoid FileInputStream/FileOutputStream
> --
>
> Key: KYLIN-3431
> URL: https://issues.apache.org/jira/browse/KYLIN-3431
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Ted Yu
>Priority: Major
> Fix For: v2.5.0
>
>
> They rely on finalizers (before Java 11), which create unnecessary GC load. 
> The alternatives, {{Files.newInputStream}}, are as easy to use and don't have 
> this issue.





[jira] [Created] (KYLIN-3435) Only keep base cuboid files on HDFS for future merge

2018-07-01 Thread Shaofeng SHI (JIRA)
Shaofeng SHI created KYLIN-3435:
---

 Summary: Only keep base cuboid files on HDFS for future merge
 Key: KYLIN-3435
 URL: https://issues.apache.org/jira/browse/KYLIN-3435
 Project: Kylin
  Issue Type: Improvement
  Components: Job Engine
Reporter: Shaofeng SHI


Today Kylin keeps the data of all cuboids on HDFS for future merges. When 
merging, Kylin needs to re-encode the dimension values with the new 
dictionaries, for all cuboids.

If we keep only the base cuboid, a lot of disk space can be saved. On merge, 
first merge the base cuboids, then calculate the other cuboids from the new 
base cuboid.
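
To illustrate the second step, here is a rough sketch of deriving a smaller cuboid from the base cuboid by aggregation. It is not Kylin's cuboid builder: CuboidRollupSketch, rollUp, and the in-memory Map representation are hypothetical, and a single additive measure (a SUM) is assumed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CuboidRollupSketch {
    /**
     * Aggregate base-cuboid rows down to the dimensions at the given indexes.
     * Keys are dimension-value tuples; values are the summed measure.
     */
    static Map<List<String>, Long> rollUp(Map<List<String>, Long> baseCuboid, int[] keepDims) {
        Map<List<String>, Long> child = new HashMap<>();
        for (Map.Entry<List<String>, Long> row : baseCuboid.entrySet()) {
            // Project the full dimension tuple onto the kept dimensions.
            List<String> key = new ArrayList<>(keepDims.length);
            for (int d : keepDims) {
                key.add(row.getKey().get(d));
            }
            // Rows that collapse onto the same key are summed.
            child.merge(key, row.getValue(), Long::sum);
        }
        return child;
    }
}
```

Measures that are not simply additive, such as count distinct, would need their own merge logic.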





[jira] [Commented] (KYLIN-3416) Kylin bitmap null pointer exception

2018-07-01 Thread Shaofeng SHI (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16529079#comment-16529079
 ] 

Shaofeng SHI commented on KYLIN-3416:
-

Hi Lemont, could you please share the cube definition? Or, if you have already 
fixed it, you are welcome to contribute a patch. Thank you!

> Kylin bitmap null pointer exception
> ---
>
> Key: KYLIN-3416
> URL: https://issues.apache.org/jira/browse/KYLIN-3416
> Project: Kylin
>  Issue Type: Bug
>  Components: Query Engine
>Affects Versions: v2.3.1
>Reporter: Lemont
>Priority: Blocker
>
> Hi team,
> I think there is a conflict between dimensional aggregation and count 
> distinct.
> For example:
> select
>  (1524931200 - biz_time)/(30*86400),
>  count(DISTINCT id) id
>  from test
> where pt ='20180621'
>  group by (1524931200 - biz_time)/(30*86400)
> Caused by: java.lang.NullPointerException
>  at 
> org.apache.kylin.measure.bitmap.RoaringBitmapCounter.getMutableBitmap(RoaringBitmapCounter.java:58)
>  at 
> org.apache.kylin.measure.bitmap.RoaringBitmapCounter.orWith(RoaringBitmapCounter.java:72)
>  at 
> org.apache.kylin.measure.bitmap.BitmapAggregator.aggregate(BitmapAggregator.java:43)
>  at 
> org.apache.kylin.measure.bitmap.BitmapDistinctCountAggFunc.add(BitmapDistinctCountAggFunc.java:31)
>  
> The problem is caused by GTCubeStorageQueryBase.isNeedStorageAggregation.
> There are three dimensions in the cube; the SQL uses cuboid 5, and 
> cuboid.getColumns is
> pt, biz_time.
> The groupD is biz_time and the singleValueD is pt,
> so isNeedStorageAggregation returns false.
> But in fact this SQL needs storage aggregation, because the grouping is by 
> (1524931200 - biz_time)/(30*86400), not by biz_time alone.
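
The point can be checked with a small sketch. ExpressionGroupSketch is a hypothetical helper, the sample timestamps are made up, and integer division is assumed for the SQL expression (as Java long division does here). Two different biz_time values land in the same group, so rows that are distinct at the cuboid level still have to be aggregated together.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class ExpressionGroupSketch {
    /** The grouping expression from the query: (1524931200 - biz_time) / (30 * 86400). */
    static long bucket(long bizTime) {
        return (1524931200L - bizTime) / (30L * 86400L);
    }

    /** Collect the distinct biz_time values that fall into each bucket. */
    static Map<Long, Set<Long>> groupByBucket(long[] bizTimes) {
        Map<Long, Set<Long>> groups = new TreeMap<>();
        for (long t : bizTimes) {
            groups.computeIfAbsent(bucket(t), k -> new TreeSet<>()).add(t);
        }
        return groups;
    }
}
```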





[jira] [Commented] (KYLIN-3411) kylin scan different in same sql

2018-07-01 Thread Shaofeng SHI (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16529081#comment-16529081
 ] 

Shaofeng SHI commented on KYLIN-3411:
-

Interesting finding!

> kylin scan different in same sql
> 
>
> Key: KYLIN-3411
> URL: https://issues.apache.org/jira/browse/KYLIN-3411
> Project: Kylin
>  Issue Type: Improvement
>  Components: Query Engine
>Affects Versions: v2.3.1
>Reporter: Lemont
>Priority: Blocker
>
> There are two SQL statements:
> select sum(value) from test where time > 1524326400 group by id
> and
> select sum(value) from test where time > (1524931200-7*86400) group by id
> As we can see, 1524326400 = (1524931200-7*86400),
> but the second query is slower than the first.
> Cuboid Ids: [3904]
> Total scan count: 1157959
> Total scan bytes: 265530668
> Result row count: 34991
> Cuboid Ids: [3904]
> Total scan count: 611795
> Total scan bytes: 140681855
> Result row count: 34991
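
The equality claimed above can be confirmed with a quick arithmetic check (FilterConstantSketch is a hypothetical name; nothing here touches Kylin):

```java
class FilterConstantSketch {
    /** Evaluates the expression used in the second query's filter. */
    static long foldedBound() {
        // 7 * 86400 = 604800 seconds, i.e. seven days.
        return 1524931200L - 7L * 86400L;
    }
}
```

Both filters describe the same bound, so any difference in scan counts would have to come from how the filter expression is handled, not from the bound itself.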





[jira] [Assigned] (KYLIN-3384) Allow setting REPLICATION_SCOPE on newly created tables

2018-07-01 Thread Shaofeng SHI (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI reassigned KYLIN-3384:
---

Assignee: 翟娜

> Allow setting REPLICATION_SCOPE on newly created tables
> ---
>
> Key: KYLIN-3384
> URL: https://issues.apache.org/jira/browse/KYLIN-3384
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Lars Francke
>Assignee: 翟娜
>Priority: Minor
> Fix For: v2.5.0
>
>
> As far as I can tell all tables are currently created "equal" in 
> `CubeHTableUtil`.
> We'd like to set REPLICATION_SCOPE to 1 on newly created tables so HBase 
> replicates the data to a second cluster.





[jira] [Commented] (KYLIN-3166) ODBC Driver could not display Chinese character properly

2018-07-01 Thread cnwangdp (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16529334#comment-16529334
 ] 

cnwangdp commented on KYLIN-3166:
-

response Info

[\{TABLE_NAME=币种维度, COLUMN_NAME=币种代码, COLUMN_TYPE=VARCHAR, COLUMN_LENGTH=10, 
COLUMN_PRECISION=0, COLUMN_DESC=}, \{TABLE_NAME=币种维度, COLUMN_NAME=币种名称, 
COLUMN_TYPE=VARCHAR, COLUMN_LENGTH=100, COLUMN_PRECISION=0, COLUMN_DESC=null}, 
\{TABLE_NAME=币种维度, COLUMN_NAME=起停标志, COLUMN_TYPE=VARCHAR, COLUMN_LENGTH=10, 
COLUMN_PRECISION=0, COLUMN_DESC=null}, \{TABLE_NAME=币种维度, COLUMN_NAME=数据加载日期, 
COLUMN_TYPE=DATE, COLUMN_LENGTH=10, COLUMN_PRECISION=0, COLUMN_DESC=null}]

> ODBC Driver could not display Chinese character properly
> 
>
> Key: KYLIN-3166
> URL: https://issues.apache.org/jira/browse/KYLIN-3166
> Project: Kylin
>  Issue Type: Bug
>  Components: Driver - ODBC
>Affects Versions: v2.1.0
> Environment: Win10 ,Excel2016,Kylin ODBC Driver v2.1.0 
>Reporter: cnwangdp
>Assignee: Dong Li
>Priority: Major
> Fix For: all
>
> Attachments: 1.png, 2.png
>
>
> !1.png!





[jira] [Commented] (KYLIN-2554) Update Chinese docs to latest kylin

2018-07-01 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-2554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16529410#comment-16529410
 ] 

ASF subversion and git services commented on KYLIN-2554:


Commit 1e9b6dd780993026a53a1c9b82a4687f36c4 in kylin's branch 
refs/heads/document from GinaZhai
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=1e9b6dd ]

KYLIN-2554

Signed-off-by: shaofengshi 


> Update Chinese docs to latest kylin
> ---
>
> Key: KYLIN-2554
> URL: https://issues.apache.org/jira/browse/KYLIN-2554
> Project: Kylin
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: nichunen
>Assignee: 翟娜
>Priority: Major
> Fix For: v2.4.0
>
>
> Currently, the Chinese docs of Kylin are out of date. This may confuse users, 
> so they need to be updated.





[jira] [Commented] (KYLIN-2554) Update Chinese docs to latest kylin

2018-07-01 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-2554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16529411#comment-16529411
 ] 

ASF subversion and git services commented on KYLIN-2554:


Commit f56bb1e275c1bbdf1352c87bf13fa31c6675d795 in kylin's branch 
refs/heads/document from shaofengshi
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=f56bb1e ]

KYLIN-2554 Update Chinese docs


> Update Chinese docs to latest kylin
> ---
>
> Key: KYLIN-2554
> URL: https://issues.apache.org/jira/browse/KYLIN-2554
> Project: Kylin
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: nichunen
>Assignee: 翟娜
>Priority: Major
> Fix For: v2.4.0
>
>
> Currently, the Chinese docs of Kylin are out of date. This may confuse users, 
> so they need to be updated.





[jira] [Commented] (KYLIN-3430) Global Dictionary Cleanup

2018-07-01 Thread Temple Zhou (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16529412#comment-16529412
 ] 

Temple Zhou commented on KYLIN-3430:


[~Shaofengshi]
 Hi Shaofeng, I'd like to contribute. Maybe, you can assign the issue to me. :D 

> Global Dictionary Cleanup
> -
>
> Key: KYLIN-3430
> URL: https://issues.apache.org/jira/browse/KYLIN-3430
> Project: Kylin
>  Issue Type: Improvement
>  Components: Tools, Build and Test
>Affects Versions: v2.1.0, v2.2.0, v2.3.0, v2.3.1, v2.4.0
>Reporter: Temple Zhou
>Priority: Major
>
> I ran {{./bin/metastore.sh clean --delete true}} to clean up my Kylin 
> metadata, but after that the Global Dictionary still exists in my HDFS and 
> the size of the directory {{/kylin_metadata/resources/GlobalDict/dict}} hasn't 
> shrunk.
>  
> BTW: I'm very sure that there are redundant Global Dictionaries.





[jira] [Commented] (KYLIN-3384) Allow setting REPLICATION_SCOPE on newly created tables

2018-07-01 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16529110#comment-16529110
 ] 

ASF subversion and git services commented on KYLIN-3384:


Commit 2c08e7f1bd6d341a19d9ae9e2b9ab2afaefed4de in kylin's branch 
refs/heads/master from GinaZhai
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=2c08e7f ]

KYLIN-3384 Allow setting REPLICATION_SCOPE on newly created tables

Signed-off-by: shaofengshi 


> Allow setting REPLICATION_SCOPE on newly created tables
> ---
>
> Key: KYLIN-3384
> URL: https://issues.apache.org/jira/browse/KYLIN-3384
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Lars Francke
>Priority: Minor
> Fix For: v2.5.0
>
>
> As far as I can tell all tables are currently created "equal" in 
> `CubeHTableUtil`.
> We'd like to set REPLICATION_SCOPE to 1 on newly created tables so HBase 
> replicates the data to a second cluster.





[jira] [Commented] (KYLIN-3416) Kylin bitmap null pointer exception

2018-07-01 Thread Lemont (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16529142#comment-16529142
 ] 

Lemont commented on KYLIN-3416:
---

[~Shaofengshi]

I'm sorry, I am new to Kylin and I don't know how Calcite interprets the SQL. 
I will provide the information for you and try to fix it later.

To keep it simple, I will use the KYLIN_SALES table.

The simplified SQL:

select count(distinct KYLIN_SALES.PRICE) from KYLIN_SALES  group by 
KYLIN_SALES.LEAF_CATEG_ID/100

I think the cause of this error is that exactAggregation is true (cuboid id is 1).

In the exactAggregation case the bitmap only carries its own size and does not 
carry the bitmap value.

The model definition is:
{
  "uuid": "d4a843df-f135-4038-9d54-f3241ddc491c",
  "last_modified": 1530459283091,
  "version": "2.4.0.20500",
  "name": "kylin_sales_model_testnull",
  "owner": "ADMIN",
  "is_draft": false,
  "description": "",
  "fact_table": "DEFAULT.KYLIN_SALES",
  "lookups": [],
  "dimensions": [
{
  "table": "KYLIN_SALES",
  "columns": [
"TRANS_ID",
"SELLER_ID",
"BUYER_ID",
"PART_DT",
"LEAF_CATEG_ID",
"LSTG_FORMAT_NAME",
"LSTG_SITE_ID",
"OPS_USER_ID",
"OPS_REGION"
  ]
}
  ],
  "metrics": [
"KYLIN_SALES.PRICE",
"KYLIN_SALES.ITEM_COUNT"
  ],
  "filter_condition": "",
  "partition_desc": {
"partition_date_column": null,
"partition_time_column": null,
"partition_date_start": 132537600,
"partition_date_format": "-MM-dd",
"partition_time_format": "HH:mm:ss",
"partition_type": "APPEND",
"partition_condition_builder": 
"org.apache.kylin.metadata.model.PartitionDesc$DefaultPartitionConditionBuilder"
  },
  "capacity": "MEDIUM"
}
The cube definition :
{
  "uuid": "b88422cf-2cbf-4cd3-921b-73a927fd09aa",
  "last_modified": 1530459405127,
  "version": "2.4.0.20500",
  "name": "test_bitmap_null_cube",
  "is_draft": false,
  "model_name": "kylin_sales_model_testnull",
  "description": "",
  "null_string": null,
  "dimensions": [
{
  "name": "LEAF_CATEG_ID",
  "table": "KYLIN_SALES",
  "column": "LEAF_CATEG_ID",
  "derived": null
}
  ],
  "measures": [
{
  "name": "_COUNT_",
  "function": {
"expression": "COUNT",
"parameter": {
  "type": "constant",
  "value": "1"
},
"returntype": "bigint"
  }
},
{
  "name": "PRICE",
  "function": {
"expression": "COUNT_DISTINCT",
"parameter": {
  "type": "column",
  "value": "KYLIN_SALES.PRICE"
},
"returntype": "bitmap"
  }
}
  ],
  "dictionaries": [
{
  "column": "KYLIN_SALES.PRICE",
  "builder": "org.apache.kylin.dict.GlobalDictionaryBuilder"
}
  ],
  "rowkey": {
"rowkey_columns": [
  {
"column": "KYLIN_SALES.LEAF_CATEG_ID",
"encoding": "dict",
"encoding_version": 1,
"isShardBy": false
  }
]
  },
  "hbase_mapping": {
"column_family": [
  {
"name": "F1",
"columns": [
  {
"qualifier": "M",
"measure_refs": [
  "_COUNT_"
]
  }
]
  },
  {
"name": "F2",
"columns": [
  {
"qualifier": "M",
"measure_refs": [
  "PRICE"
]
  }
]
  }
]
  },
  "aggregation_groups": [
{
  "includes": [
"KYLIN_SALES.LEAF_CATEG_ID"
  ],
  "select_rule": {
"hierarchy_dims": [],
"mandatory_dims": [],
"joint_dims": []
  }
}
  ],
  "signature": "TVE29H3H4ElAm/4pEljkOQ==",
  "notify_list": [],
  "status_need_notify": [
"ERROR",
"DISCARDED",
"SUCCEED"
  ],
  "partition_date_start": 0,
  "partition_date_end": 31536,
  "auto_merge_time_ranges": [
60480,
241920
  ],
  "volatile_range": 0,
  "retention_range": 0,
  "engine_type": 2,
  "storage_type": 2,
  "override_kylin_properties": {},
  "cuboid_black_list": [],
  "parent_forward": 3,
  "mandatory_dimension_set_list": [],
  "snapshot_table_desc_list": []
}

> Kylin bitmap null pointer exception
> ---
>
> Key: KYLIN-3416
> URL: https://issues.apache.org/jira/browse/KYLIN-3416
> Project: Kylin
>  Issue Type: Bug
>  Components: Query Engine
>Affects Versions: v2.3.1
>Reporter: Lemont
>Priority: Blocker
>
> Hi team,
> I think there is a conflict between dimensional aggregation and count 
> distinct.
> For example:
> select
>  (1524931200 - biz_time)/(30*86400),
>  count(DISTINCT id) id
>  from test
> where pt ='20180621'
>  group by (1524931200 - biz_time)/(30*86400)
> Caused by: java.lang.NullPointerException
>  at 
> 

[jira] [Comment Edited] (KYLIN-3416) Kylin bitmap null pointer exception

2018-07-01 Thread Lemont (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16529142#comment-16529142
 ] 

Lemont edited comment on KYLIN-3416 at 7/1/18 4:15 PM:
---

[~Shaofengshi]

I'm sorry, I am new to Kylin and I don't know how Calcite interprets the SQL. 
I will provide the information for you and try to fix it later.

To keep it simple, I will use the KYLIN_SALES table.

The simplified SQL:
{quote}select count(distinct KYLIN_SALES.PRICE) from KYLIN_SALES  group by 
KYLIN_SALES.LEAF_CATEG_ID/100
{quote}
I think the cause of this error is that exactAggregation is true (cuboid id is 1).

In the exactAggregation case the bitmap only carries its own size and does not 
carry the bitmap value.

The model definition is:
{quote}{ "uuid": "d4a843df-f135-4038-9d54-f3241ddc491c", "last_modified": 
1530459283091, "version": "2.4.0.20500", "name": "kylin_sales_model_testnull", 
"owner": "ADMIN", "is_draft": false, "description": "", "fact_table": 
"DEFAULT.KYLIN_SALES", "lookups": [], "dimensions": [ { "table": 
"KYLIN_SALES", "columns": [ "TRANS_ID", "SELLER_ID", "BUYER_ID", "PART_DT", 
"LEAF_CATEG_ID", "LSTG_FORMAT_NAME", "LSTG_SITE_ID", "OPS_USER_ID", 
"OPS_REGION" ] }

],
 "metrics": [
 "KYLIN_SALES.PRICE",
 "KYLIN_SALES.ITEM_COUNT"
 ],
 "filter_condition": "",
 "partition_desc": {
 "partition_date_column": null,
 "partition_time_column": null,
 "partition_date_start": 132537600,
 "partition_date_format": "-MM-dd",
 "partition_time_format": "HH:mm:ss",
 "partition_type": "APPEND",
 "partition_condition_builder": 
"org.apache.kylin.metadata.model.PartitionDesc$DefaultPartitionConditionBuilder"
 },
 "capacity": "MEDIUM"
 }
{quote}
The cube definition :
{quote}{ "uuid": "b88422cf-2cbf-4cd3-921b-73a927fd09aa", "last_modified": 
1530459405127, "version": "2.4.0.20500", "name": "test_bitmap_null_cube", 
"is_draft": false, "model_name": "kylin_sales_model_testnull", "description": 
"", "null_string": null, "dimensions": [ { "name": "LEAF_CATEG_ID", "table": 
"KYLIN_SALES", "column": "LEAF_CATEG_ID", "derived": null }

],
 "measures": [
 {
 "name": "_COUNT_",
 "function": {
 "expression": "COUNT",
 "parameter":

{ "type": "constant", "value": "1" }

,
 "returntype": "bigint"
 }
 },
 {
 "name": "PRICE",
 "function": {
 "expression": "COUNT_DISTINCT",
 "parameter":

{ "type": "column", "value": "KYLIN_SALES.PRICE" }

,
 "returntype": "bitmap"
 }
 }
 ],
 "dictionaries": [
 {
 "column": "KYLIN_SALES.PRICE",
 "builder": "org.apache.kylin.dict.GlobalDictionaryBuilder"
 }
 ],
 "rowkey": {
 "rowkey_columns": [

{ "column": "KYLIN_SALES.LEAF_CATEG_ID", "encoding": "dict", 
"encoding_version": 1, "isShardBy": false }

]
 },
 "hbase_mapping": {
 "column_family": [
 {
 "name": "F1",
 "columns": [

{ "qualifier": "M", "measure_refs": [ "_COUNT_" ] }

]
 },
 {
 "name": "F2",
 "columns": [

{ "qualifier": "M", "measure_refs": [ "PRICE" ] }

]
 }
 ]
 },
 "aggregation_groups": [
 {
 "includes": [
 "KYLIN_SALES.LEAF_CATEG_ID"
 ],
 "select_rule":

{ "hierarchy_dims": [], "mandatory_dims": [], "joint_dims": [] }

}
 ],
 "signature": "TVE29H3H4ElAm/4pEljkOQ==",
 "notify_list": [],
 "status_need_notify": [
 "ERROR",
 "DISCARDED",
 "SUCCEED"
 ],
 "partition_date_start": 0,
 "partition_date_end": 31536,
 "auto_merge_time_ranges": [
 60480,
 241920
 ],
 "volatile_range": 0,
 "retention_range": 0,
 "engine_type": 2,
 "storage_type": 2,
 "override_kylin_properties": {},
 "cuboid_black_list": [],
 "parent_forward": 3,
 "mandatory_dimension_set_list": [],
 "snapshot_table_desc_list": []
 }
{quote}


was (Author: lemontsr):
[~Shaofengshi]

I'm sorry, I am new to Kylin and I don't know how Calcite interprets the SQL. 
I will provide the information for you and try to fix it later.

To keep it simple, I will use the KYLIN_SALES table.

The simplified SQL:

select count(distinct KYLIN_SALES.PRICE) from KYLIN_SALES  group by 
KYLIN_SALES.LEAF_CATEG_ID/100

I think the cause of this error is that exactAggregation is true (cuboid id is 1).

In the exactAggregation case the bitmap only carries its own size and does not 
carry the bitmap value.

The model definition is:
{
  "uuid": "d4a843df-f135-4038-9d54-f3241ddc491c",
  "last_modified": 1530459283091,
  "version": "2.4.0.20500",
  "name": "kylin_sales_model_testnull",
  "owner": "ADMIN",
  "is_draft": false,
  "description": "",
  "fact_table": "DEFAULT.KYLIN_SALES",
  "lookups": [],
  "dimensions": [
{
  "table": "KYLIN_SALES",
  "columns": [
"TRANS_ID",
"SELLER_ID",
"BUYER_ID",
"PART_DT",
"LEAF_CATEG_ID",
"LSTG_FORMAT_NAME",
"LSTG_SITE_ID",
"OPS_USER_ID",
"OPS_REGION"
  ]
}
  ],
  "metrics": [
"KYLIN_SALES.PRICE",
"KYLIN_SALES.ITEM_COUNT"
  ],
  "filter_condition": "",
  "partition_desc": {
"partition_date_column": null,
"partition_time_column": null,
"partition_date_start": 132537600,