Encountered some problems when querying data

2017-10-16 Thread feng
Hello, dev. 1. When using a 'like' query in SQL, I found a bug. E.g.: select ake005,count(1) from ca_08_kc22 where ake005 like '33011%' and akc225>10 group by ake005; This returns the following sample data: ++--+--+ | ake005 | _c1 | ++--+--+ | 33011 | 1

Re: insert carbondata table failed

2017-09-18 Thread feng
Thank you. I tried to resolve this issue by changing the Spark configuration and using two fields as DICTIONARY_INCLUDE. The test data (30 GB) was loaded 8 times, with each load taking about 1.5 minutes to complete. I am currently testing another, larger data set and hope it will succeed. Thank you very much for t

Re: insert carbondata table failed

2017-09-18 Thread feng
Sorry. There are 4 nodes in total, of which 3 are datanodes, with the SNN on one of the datanodes. Versions: CarbonData 1.1.0, Spark 1.6.0, Hadoop 2.7.2. Thank you for your help, I'm trying again. Liu feng -----Original Message----- From: ravipesala [mailto:ravi.pes...@gmail.com] Sent: 2017-09-19

insert carbondata table failed

2017-09-18 Thread feng
Hi, community: The job inserts records from a source table into a target CarbonData table (kc22_ca). The source table is a Hive table ('kc22_p1'). kc22_p1 records: 102200946, 51.5 GB. Stage: spark-shell --master yarn-client --driver-memory 20G --executor-cores 1 --num-executors 12 --executor-mem

Questions about loading data into CarbonData

2017-09-16 Thread feng
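Hello, I have recently been studying CarbonData and ran into several problems when loading data: 1. When loading more than 10 GB of data, an error is raised at collect at GlobalDictionaryUtil.scala:746, so the load cannot proceed. 2. For data under 5 GB, an insert into a newly created table succeeds within a minute or two, but an incremental insert is very slow, taking about thirty minutes. Is there any way to optimize these cases? Thank you! Configuration: the cluster has three data nodes, configuration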