[jira] [Created] (KYLIN-2496) Table snapshot should be no greater than 300MB
Kailun Zhang created KYLIN-2496:
---
Summary: Table snapshot should be no greater than 300MB
Key: KYLIN-2496
URL: https://issues.apache.org/jira/browse/KYLIN-2496
Project: Kylin
Issue Type: Bug
Affects Versions: v1.5.2
Reporter: Kailun Zhang
Fix For: v1.5.2

My fact table has 10 million rows and joins with a lookup table on userid; the lookup table has 6 million rows. I set the column gender as a dimension to build the cube, but the build failed with: java.lang.IllegalStateException: Table snapshot should be no greater than 300 MB, but TableDesc[database=mydatabase name=my table name] size is 1442042137. Can Kylin handle joining on such a high-cardinality lookup table? How can I resolve this problem and build the cube? Thanks!

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
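For readers hitting the same error: the 1442042137 bytes in the message is roughly 1.4 GB, well past the 300 MB snapshot cap. A hedged kylin.properties sketch of one way out — note that the property name below comes from later Kylin releases and should be verified against your version's configuration reference; in v1.5.x the cap may be hardcoded, in which case shrinking the lookup table (or moving its high-cardinality columns onto the fact table so no snapshot is needed) is the workaround:

```
# Assumed property from later Kylin versions (verify against your version's
# kylin.properties reference); raises the lookup-table snapshot size cap.
# 1500 MB is an example value chosen to fit the ~1.4 GB snapshot above.
kylin.snapshot.max-mb=1500
```

Raising the cap trades build-time memory and metadata-store size for convenience, so it is usually a stopgap; remodeling the high-cardinality lookup is the more durable fix.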
Re: build cube with spark ERROR
Spark didn't find the MySQL connector jar on the classpath. Check: https://stackoverflow.com/questions/33192886/com-mysql-jdbc-driver-not-found-on-classpath-while-starting-spark-sql-and-thrift

You can add additional Spark jars in kylin.properties, e.g.:
kylin.engine.spark.additional-jars=/path/to/mysql-connector-java-5.1.38-bin.jar

2017-03-10 11:37 GMT+08:00 仇同心:
> Hi all,
> When building a cube with Spark, I met some errors. It seems to be linked to Hive; can you help me?
>
> javax.jdo.JDOFatalInternalException: Error creating transactional connection factory
>     at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:587)
>     at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:788)
>     at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.getPersistenceManagerFactory(JDOPersistenceManagerFactory.java:202)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at javax.jdo.JDOHelper$16.run(JDOHelper.java:1965)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.jdo.JDOHelper.invoke(JDOHelper.java:1960)
>     at javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1166)
>     at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:808)
>     at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:701)
>     at org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:365)
>     at org.apache.hadoop.hive.metastore.ObjectStore.getPersistenceManager(ObjectStore.java:394)
>     at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:291)
>     at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:258)
>     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
>     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
>     at org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:57)
>     at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:66)
>     at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:593)
>     at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:571)
>     at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:620)
>     at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:461)
>     at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:66)
>     at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:72)
>     at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5762)
>     at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:199)
>     at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>     at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1521)
>     at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86)
>     at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
>     at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
>     at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3005)
>     at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3024)
>     at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1234)
>     at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:174)
>     at org.apache.hadoop.hive.ql.metadata.Hive.<init>(Hive.java:166)
>     at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:503)
>     at
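Pulling the reply's fix into one place, a minimal kylin.properties sketch (the jar path and connector version below are examples; point them at the driver jar actually installed on your Kylin node):

```
# Make the MySQL JDBC driver visible to the Spark job Kylin launches for the
# cube build; without it, the Hive metastore connection fails as in the trace
# above. Path and version are placeholders.
kylin.engine.spark.additional-jars=/path/to/mysql-connector-java-5.1.38-bin.jar
```

Restart Kylin after editing kylin.properties so the Spark engine picks up the new classpath entry.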
[jira] [Created] (KYLIN-2495) query exception when integer column encoded as date/time encoding
hongbin ma created KYLIN-2495:
-
Summary: query exception when integer column encoded as date/time encoding
Key: KYLIN-2495
URL: https://issues.apache.org/jira/browse/KYLIN-2495
Project: Kylin
Issue Type: Bug
Reporter: hongbin ma
Assignee: hongbin ma

In KYLIN-, we claimed that an integer column can use date/time encoding. However, when I tried to query such a cube, an exception was thrown:
{code}
java.sql.SQLException: Error while executing SQL "select * from fact0309 LIMIT 5": For input string: "70225920"
{code}
The fact table desc is:
{code}
hive> desc fact0309;
OK
tdate    int
country  string
price    decimal(10,0)
{code}
And the sample data is:
{code}
19980302  US  100
19920403  CN  100
19920403  US  33
{code}
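As a hedged illustration only (this is not Kylin's actual decode path): date encoding presumes the stored integer is a well-formed yyyyMMdd value, and the 70225920 in the exception above is not one, which is consistent with a decode failure at query time. A quick standalone check:

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Hypothetical helper (not from the Kylin codebase): checks whether an int
// column value is a parseable yyyyMMdd date, as date encoding expects.
public class DateEncodingCheck {
    static final DateTimeFormatter YYYYMMDD = DateTimeFormatter.ofPattern("yyyyMMdd");

    static boolean isValidDateInt(int value) {
        try {
            LocalDate.parse(Integer.toString(value), YYYYMMDD);
            return true;
        } catch (DateTimeParseException e) {
            return false; // e.g. month field out of range
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidDateInt(19980302)); // from the sample data: a real date
        System.out.println(isValidDateInt(70225920)); // the value in the exception: month "59"
    }
}
```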
Question Regrading Cube Query Time
Hello,

I am doing a POC on Kylin cubes. I have built a cube on TPC-DS data (~40GB). The build was successful, but I am facing issues with queries. Simple aggregation queries return results in sub-seconds, but queries with ORDER BY/GROUP BY take too much time. At first, queries were failing with a timeout error because of the record scan threshold, so I increased the "kylin.query.scan.threshold" value in kylin.properties. That fixed the threshold error, but queries were taking around 200 seconds, which is not acceptable because Hive returns results in 10 seconds for the same query. This is one of the queries (standard TPC-DS query q3) I am trying to run:

SELECT date_dim.d_year,
       item.i_brand_id,
       item.i_brand,
       sum(facttable.ss_ext_discount_amt) sum_agg
FROM store_sales facttable
INNER JOIN date_dim date_dim ON (facttable.ss_sold_date_sk = date_dim.d_date_sk)
INNER JOIN item item ON (facttable.ss_item_sk = item.i_item_sk)
WHERE item.i_manufact_id = 783 and date_dim.d_moy = 11
GROUP BY date_dim.d_year, item.i_brand, item.i_brand_id
ORDER BY date_dim.d_year, sum_agg DESC, item.i_brand_id
LIMIT 100;

My cluster details: 10 nodes (each with 32 cores and 64GB RAM) running HDP 2.5 and HBase 1.1.2.2.5.3.0-37 (fully distributed mode).

To investigate, I checked the region server logs on all the nodes and found that during query execution only one region server was doing all the work while the others were idle. My cube's HBase table was also showing a region count of 1, so I tried changing the following properties, but still no luck:

kylin.hbase.hfile.size.gb=1
kylin.hbase.region.count.min=8

Please let me know if there is any other configuration needed to fix the large query time. Thanks
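One point worth noting on the single-region symptom described above: Kylin decides the HBase region layout when a cube segment is built, so editing these properties has no effect on an existing segment. A hedged sketch of the relevant kylin.properties knobs (exact names can differ slightly across Kylin versions; verify against your version's configuration reference, and rebuild the cube afterwards):

```
# Assumed region-splitting settings; effective only for segments built after
# the change. Values below are illustrative, not recommendations.
kylin.hbase.region.cut=1          # target GB of cube data per HBase region
kylin.hbase.region.count.min=8    # lower bound on regions per segment
kylin.hbase.region.count.max=100  # upper bound on regions per segment
```

With more regions, the coprocessor scan for a query fans out across region servers instead of landing on one, which is the usual fix for the one-busy-server pattern reported here.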
[jira] [Created] (KYLIN-2494) Model has no dup column on dimensions and measures
liyang created KYLIN-2494:
-
Summary: Model has no dup column on dimensions and measures
Key: KYLIN-2494
URL: https://issues.apache.org/jira/browse/KYLIN-2494
Project: Kylin
Issue Type: Improvement
Reporter: liyang
Assignee: liyang

It does not make sense that a column appears as both dimension and measure in a model. A column must be either dimension or measure.
[jira] [Created] (KYLIN-2493) BufferOverflowException in FactDistinctColumnsMapper when a value is very long
XIE FAN created KYLIN-2493:
--
Summary: BufferOverflowException in FactDistinctColumnsMapper when a value is very long
Key: KYLIN-2493
URL: https://issues.apache.org/jira/browse/KYLIN-2493
Project: Kylin
Issue Type: Bug
Reporter: XIE FAN
Assignee: XIE FAN

Error stack:
Error: java.nio.BufferOverflowException
    at java.nio.HeapByteBuffer.put(HeapByteBuffer.java:183)
    at java.nio.ByteBuffer.put(ByteBuffer.java:832)
    at org.apache.kylin.engine.mr.steps.FactDistinctColumnsMapper.doMap(FactDistinctColumnsMapper.java:157)
    at org.apache.kylin.engine.mr.KylinMapper.map(KylinMapper.java:48)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
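The failure mode in the trace above is generic to java.nio: an unguarded ByteBuffer.put of a value longer than the buffer's remaining space throws BufferOverflowException. A minimal standalone sketch (not Kylin's actual mapper code) showing both the crash and a guarded alternative:

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;

// Sketch of the FactDistinctColumnsMapper failure mode: writing a column
// value longer than the fixed-size buffer overflows it.
public class BufferOverflowDemo {

    // Returns true if writing valueLen bytes into a bufSize-byte buffer
    // overflows, mimicking the unguarded put in the stack trace above.
    static boolean overflows(int bufSize, int valueLen) {
        ByteBuffer buf = ByteBuffer.allocate(bufSize);
        try {
            buf.put(new byte[valueLen]);
            return false;
        } catch (BufferOverflowException e) {
            return true;
        }
    }

    // Guarded alternative: check remaining capacity first so the caller can
    // reallocate a larger buffer (or reject the oversized value) cleanly.
    static boolean tryPut(ByteBuffer buf, byte[] value) {
        if (value.length > buf.remaining()) {
            return false;
        }
        buf.put(value);
        return true;
    }

    public static void main(String[] args) {
        System.out.println(overflows(8, 16));  // 16 bytes into an 8-byte buffer: overflows
        System.out.println(overflows(32, 16)); // fits, no exception
    }
}
```

The usual fix for this class of bug is either the capacity check shown in tryPut or growing the buffer to match the longest expected value.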