Hi Libis,
The spark-sql CLI is not supported by CarbonData.
Why not use the Carbon thrift server with Beeline instead? It works the
same as the spark-sql CLI and reports the execution time for each query.
Start the CarbonData thrift server with:
bin/spark-submit --class
OK, Thanks for your answer.
The major logic seems to be the same.
However, on my machine, Carbon costs 3-4 times as much as ORC when
grouping by a field.
I will try some options on Presto, such as concurrency, and improve my
test hardware.
Hi,
Now I can use CarbonData 1.0.0 with spark-shell (Spark 2.1) as:
./bin/spark-shell --jars
but it's inconvenient to get the query time, so I tried
./bin/spark-sql --jars , but I found some
errors when creating a table:
spark-sql> create table if not exists test_table(id string, name
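Since spark-shell does not print query times, one workaround is to define a small timing helper in the shell and wrap each query call in it. This is a minimal sketch (the `timed` helper is an ad-hoc name, not a Spark or CarbonData API); in spark-shell you would wrap e.g. `spark.sql("...").show()` in it.

```scala
// Ad-hoc timing helper for spark-shell (hypothetical name, not a
// Spark API): evaluates the wrapped expression, prints elapsed ms,
// and returns the expression's result unchanged.
def timed[T](label: String)(body: => T): T = {
  val start = System.nanoTime()
  val result = body                        // run the wrapped expression
  val elapsedMs = (System.nanoTime() - start) / 1e6
  println(f"$label%s took $elapsedMs%.1f ms")
  result
}

// Self-contained usage (no Spark needed for the helper itself);
// in spark-shell this would be timed("q1") { spark.sql("...").show() }
val sum = timed("sum") { (1 to 1000000).map(_.toLong).sum }
```

The helper returns the body's value, so it can wrap any expression without changing the surrounding code.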
Hi,
This exception is actually ignored in class SegmentUpdateStatusManager at
line number 696, and it does not cause any problem. Usually this exception
is not printed in any server logs because we ignore it; maybe spark-shell
is printing it. We will look into it.
Regards,
Ravindra.
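The pattern described above (catch the exception, swallow it, and fall back to a default so it never reaches default server logs) can be sketched roughly as follows. All names here are hypothetical illustrations, not the actual CarbonData code.

```scala
import java.util.logging.{Level, Logger}

// Sketch of the "ignore but keep quiet" pattern described above
// (hypothetical names; not the real SegmentUpdateStatusManager code).
val log = Logger.getLogger("SegmentUpdateStatusManagerSketch")

def readUpdateStatus(path: String): Option[String] =
  try {
    if (path.isEmpty) throw new java.io.IOException("status file missing")
    Some(s"status@$path")
  } catch {
    case e: java.io.IOException =>
      // Swallow the exception: log at FINE, which default server log
      // configurations filter out, and fall back to an empty status.
      log.log(Level.FINE, "ignoring missing update status", e)
      None
  }
```

Because the exception is logged below the default level, it only surfaces in environments (like an interactive shell) that print more verbosely.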
Hi,
The performance depends on the query plan. When you submit a query
like [select attributeA, count(*) from tableB group by attributeA], in
the case of Spark it asks Carbon for only the attributeA column, so
Carbon reads only the attributeA column from all files and sends the result to Spark to
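The column pruning being described can be illustrated with a toy in-memory columnar layout (an assumption for illustration only, not CarbonData's actual reader): because each column is stored separately, a group-by-count on attributeA only ever scans that one column.

```scala
// Toy columnar table: each column stored as its own sequence
// (hypothetical sketch of column pruning, not CarbonData internals).
val columns: Map[String, Seq[String]] = Map(
  "attributeA" -> Seq("x", "y", "x", "x"),
  "attributeB" -> Seq("1", "2", "3", "4")
)

// For `select attributeA, count(*) ... group by attributeA`, only the
// projected/grouping column is read; attributeB is never touched.
def groupCount(col: String): Map[String, Int] =
  columns(col).groupBy(identity).map { case (k, v) => (k, v.size) }

val counts = groupCount("attributeA")
```

With a row-oriented format, the same query would have to read every column of every row; skipping the untouched columns is where a columnar store wins on this plan shape.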
GitHub user PallaviSingh1992 opened a pull request:
https://github.com/apache/incubator-carbondata-site/pull/14
Fixed the Image not displaying in File Structure
You can merge this pull request into a Git repository by running:
$ git pull
anubhav tarar created CARBONDATA-699:
Summary: using column group column name in dictionary_exclude do
not give any exception
Key: CARBONDATA-699
URL: https://issues.apache.org/jira/browse/CARBONDATA-699