, 2014 at 1:24 AM, Jianshi Huang jianshi.hu...@gmail.com
wrote:
Yes, it looks like it can only be controlled by the
parameter spark.sql.autoBroadcastJoinThreshold, which is a little bit weird
to me.
How am I supposed to know the exact size of a table in bytes? Letting me
specify which join algorithm is preferred would be better.
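For context, the threshold mentioned here is set like this in a spark-shell session (a sketch only; the 10 MB value is illustrative, and the setConf call against the SQL context is an assumption based on the Spark 1.1-era API):

```scala
// Sketch: controlling broadcast joins by size threshold (Spark 1.1-era API).
// Tables whose statistics report a size below the threshold get broadcast.
// The 10 MB figure is illustrative, not from this thread.
sqlContext.setConf("spark.sql.autoBroadcastJoinThreshold", (10 * 1024 * 1024).toString)
// Setting it to -1 disables broadcast joins entirely.
```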
, Jianshi Huang jianshi.hu...@gmail.com
wrote:
Looks like https://issues.apache.org/jira/browse/SPARK-1800 is not merged
into master?
I cannot find spark.sql.hints.broadcastTables in latest master, but it's
in the following patch.
https://github.com/apache/spark/commit
sql(ddl)
setConf("spark.sql.hive.convertMetastoreParquet", "true")
You'll also need to run this to populate the statistics:
ANALYZE TABLE tableName COMPUTE STATISTICS noscan;
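Put together, the flow might look like this from a spark-shell HiveContext (a sketch; "my_table" is a placeholder name, and the DESC step mirrors Hao's suggestion later in this thread):

```scala
// Sketch: populate table statistics so Spark SQL can pick broadcast joins.
// "my_table" is a placeholder; run against a HiveContext.
sql("ANALYZE TABLE my_table COMPUTE STATISTICS noscan")
// Verify the statistics landed: look for "totalSize" in the output.
sql("DESC EXTENDED my_table").collect().foreach(println)
```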
On Wed, Oct 8, 2014 at 1:44 AM, Jianshi Huang jianshi.hu...@gmail.com
wrote:
Ok, currently there's cost-based
Hmm... it failed again; it just lasted a little bit longer.
Jianshi
On Mon, Oct 13, 2014 at 4:15 PM, Jianshi Huang jianshi.hu...@gmail.com
wrote:
https://issues.apache.org/jira/browse/SPARK-3106
I'm having the same errors described in SPARK-3106 (no other types of
errors confirmed), running
Turned out it was caused by this issue:
https://issues.apache.org/jira/browse/SPARK-3923
Setting spark.akka.heartbeat.interval to 100 solved it.
Jianshi
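The workaround can be applied when building the SparkConf (a sketch; the app name is a placeholder, and the spark.akka.* property only exists in the Akka-based Spark versions of that era):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Sketch of the SPARK-3923 workaround: a longer Akka heartbeat interval
// avoids the spurious disassociation. "MyApp" is a placeholder name.
val conf = new SparkConf()
  .setAppName("MyApp")
  .set("spark.akka.heartbeat.interval", "100")
val sc = new SparkContext(conf)
```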
On Mon, Oct 13, 2014 at 4:24 PM, Jianshi Huang jianshi.hu...@gmail.com
wrote:
--
Jianshi Huang
LinkedIn: jianshi
Twitter: @jshuang
Github Blog: http://huangjs.github.com/
Ah I see. Thanks Hao! I'll wait for the fix.
Jianshi
On Mon, Oct 27, 2014 at 4:57 PM, Cheng, Hao hao.ch...@intel.com wrote:
The hive-thriftserver module is not included when specifying the profile
hive-0.13.1.
-----Original Message-----
From: Jianshi Huang [mailto:jianshi.hu...@gmail.com
)
at
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
Using the same DDL and ANALYZE script as above.
Jianshi
On Sat, Oct 11, 2014 at 2:18 PM, Jianshi Huang jianshi.hu...@gmail.com
wrote:
It works fine, thanks for the help Michael.
Liancheng,
ls /usr/lib/hive/lib doesn't show any of the parquet
jars, but ls /usr/lib/impala/lib shows the jar we're looking for,
parquet-hive-1.0.jar.
Is it removed from latest Spark?
Jianshi
On Wed, Nov 26, 2014 at 2:13 PM, Jianshi Huang jianshi.hu...@gmail.com
wrote:
Hi,
Looks like the latest SparkSQL
using latest Spark built from master HEAD yesterday. Is this a bug?
Actually my HADOOP_CLASSPATH has already been set to include
/etc/hadoop/conf/*
export HADOOP_CLASSPATH=/etc/hbase/conf/hbase-site.xml:/usr/lib/hbase/lib/hbase-protocol.jar:$(hbase classpath)
Jianshi
On Fri, Dec 5, 2014 at 11:54 AM, Jianshi Huang jianshi.hu...@gmail.com
wrote:
Looks like the datanucleus*.jar shouldn't appear in the hdfs path in
yarn-client mode.
Maybe this patch broke yarn-client.
https://github.com/apache/spark/commit/a975dc32799bb8a14f9e1c76defaaa7cfbaf8b53
Jianshi
On Fri, Dec 5, 2014 at 12:02 PM, Jianshi Huang jianshi.hu...@gmail.com
wrote
Correction:
According to Liancheng, this hotfix might be the root cause:
https://github.com/apache/spark/commit/38cb2c3a36a5c9ead4494cbc3dde008c2f0698ce
Jianshi
On Fri, Dec 5, 2014 at 12:45 PM, Jianshi Huang jianshi.hu...@gmail.com
wrote:
Looks like the datanucleus*.jar shouldn't appear
I created a ticket for this:
https://issues.apache.org/jira/browse/SPARK-4757
Jianshi
On Fri, Dec 5, 2014 at 1:31 PM, Jianshi Huang jianshi.hu...@gmail.com
wrote:
Correction:
According to Liancheng, this hotfix might be the root cause:
https://github.com/apache/spark/commit
-most
among the inner joins;
DESC EXTENDED tablename; -- this will print detailed information for the
table, including the size statistic (the field "totalSize")
EXPLAIN EXTENDED query; -- this will print the detailed physical plan.
Let me know if you still have problems.
Hao
*From:* Jianshi Huang
Hi,
I got an exception saying Hive: NoSuchObjectException(message:table table
not found)
when running DROP TABLE IF EXISTS table.
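A minimal repro, assuming a HiveContext in spark-shell (the table name is a placeholder):

```scala
// DROP TABLE IF EXISTS on a non-existent table should be a silent no-op;
// per this thread it instead surfaced NoSuchObjectException.
sql("DROP TABLE IF EXISTS some_missing_table")
```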
Looks like a new regression in Hive module.
Anyone can confirm this?
Thanks,
With Liancheng's suggestion, I've tried setting
spark.sql.hive.convertMetastoreParquet to false,
but ANALYZE with NOSCAN still returns -1 for rawDataSize.
Jianshi
On Fri, Dec 5, 2014 at 3:33 PM, Jianshi Huang jianshi.hu...@gmail.com
wrote:
If I run ANALYZE without NOSCAN, then Hive can successfully
sql("select cre_ts from pmt limit 1").collect
res16: Array[org.apache.spark.sql.Row] = Array([null])
I created a JIRA for it:
https://issues.apache.org/jira/browse/SPARK-4781
Jianshi
On Sun, Dec 7, 2014 at 1:06 AM, Jianshi Huang jianshi.hu...@gmail.com
wrote:
Hmm... another issue I found
, 2014 at 8:28 PM, Jianshi Huang jianshi.hu...@gmail.com
wrote:
Ok, found another possible bug in Hive.
My current solution is to use ALTER TABLE CHANGE to rename the columns.
The problem is that after renaming the columns, the values of the columns
all became NULL.
Before renaming
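The rename described here would look roughly like this from a HiveContext (a sketch; table name, column names, and type are placeholders):

```scala
// Sketch of the problematic rename via ALTER TABLE ... CHANGE.
// Hive DDL requires restating the column type.
sql("ALTER TABLE my_table CHANGE old_col new_col STRING")
// With Parquet before Hive 0.14, columns are resolved by name, so data
// written under the old name can read back as NULL after the rename.
```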
FYI,
The latest Hive 0.14/Parquet will have column renaming support.
Jianshi
On Wed, Dec 10, 2014 at 3:37 AM, Michael Armbrust mich...@databricks.com
wrote:
You might also try out the recently added support for views.
On Mon, Dec 8, 2014 at 9:31 PM, Jianshi Huang jianshi.hu...@gmail.com
wrote