at org.joda.time.DateTimeZone.convertUTCToLocal(DateTimeZone.java:925)
Any ideas?
Thanks
From: Ted Yu [mailto:yuzhih...@gmail.com]
Sent: May-11-16 5:32 PM
To: Younes Naguib
Cc: user@spark.apache.org
Subject: Re: kryo
Have you seen this thread?
http://search-hadoop.com/m/q3RTtpO0qI3cp06/JodaDateTimeSerializer+spark
it, and register it in the spark-defaults conf.
Thanks,
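A minimal sketch of that registration, assuming the de.javakaffee kryo-serializers artifact is on the classpath (the registrator class name here is illustrative):

  import com.esotericsoftware.kryo.Kryo
  import de.javakaffee.kryoserializers.jodatime.JodaDateTimeSerializer
  import org.apache.spark.serializer.KryoRegistrator

  // Register Joda's DateTime with a dedicated serializer; Kryo's default
  // FieldSerializer trips over the DateTimeZone graph, as in the trace above.
  class JodaKryoRegistrator extends KryoRegistrator {
    override def registerClasses(kryo: Kryo): Unit = {
      kryo.register(classOf[org.joda.time.DateTime], new JodaDateTimeSerializer)
    }
  }

and in spark-defaults.conf:

  spark.serializer       org.apache.spark.serializer.KryoSerializer
  spark.kryo.registrator JodaKryoRegistrator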
Younes Naguib
Hi all,
When is Parquet 2.0 planned in Spark?
Or is it already?
Younes Naguib
Triton Digital | 1440 Ste-Catherine W., Suite 1200 | Montreal, QC H3G 1R8
Tel.: +1 514 448 4037 x2688 | Tel.: +1 866 448 4037 x2688 |
younes.nag...@tritondigital.com
Any way to cache the subquery, or force a broadcast join, without persisting it?
y
From: Michael Armbrust [mailto:mich...@databricks.com]
Sent: March-17-16 8:59 PM
To: Younes Naguib
Cc: user@spark.apache.org
Subject: Re: Subquery performance
Try running EXPLAIN on both versions of the query.
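If the DataFrame API is reachable (e.g. from a shell rather than over JDBC), a sketch of forcing the broadcast without persisting anything, using the table names from the query below:

  import org.apache.spark.sql.functions.broadcast

  val tab1 = sqlContext.table("tab1")
  val agg  = sqlContext.sql("select col2, count(1) as cnt from tab2 group by col2")
  // broadcast() only hints the plan; nothing is cached or persisted.
  val result = tab1.join(broadcast(agg), tab1("col1") === agg("col2"))
    .groupBy("col1").count()
  result.explain()  // verify a broadcast join shows up, as suggested above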
Hi all,
I'm running a query that looks like the following:
Select col1, count(1)
From (Select col2, count(1) from tab2 group by col2)
Inner join tab1 on (col1=col2)
Group by col1
This creates a very large shuffle, 10 times the data size, as if the subquery
were executed for each row.
Anything
Hi all,
Since 1.6.0, low-latency queries are much slower.
This seems to be connected to the multi-user support in the thrift server.
So on any newly created session, jobs are added to fill the session cache with
information related to the tables it queries.
Here are the details for this job:
load at
Hi,
I'm running CTAS, and it fails with "Error: java.lang.AssertionError: assertion
failed: No plan for CreateTableAsSelect HiveTable"
Here is what my SQL looks like:
Create table tbl (
  Col1 timestamp,
  Col2 string,
  Col3 int,
  ...
)
partitioned by (year int, month int, day
SQL on beeline, connecting to the thrift server.
Younes
From: Ted Yu [mailto:yuzhih...@gmail.com]
Sent: January-26-16 11:05 AM
To: Younes Naguib
Cc: user@spark.apache.org
Subject: Re: ctas fails with "No plan for CreateTableAsSelect"
Were you using HiveContext or SQLContext?
Ca
11:11 AM
To: Younes Naguib
Cc: user@spark.apache.org
Subject: Re: ctas fails with "No plan for CreateTableAsSelect"
Maybe try enabling the following (false by default):
"spark.sql.hive.convertCTAS"
doc = "When true, a table created by a Hive CTAS s
From: Tejas Patil [mailto:tejas.patil...@gmail.com]
Sent: January-26-16 11:39 AM
To: Younes Naguib
Cc: user@spark.apache.org
Subject: Re: ctas fails with "No plan for CreateTableAsSelect"
In CTAS, you should not specify the column information, as it is derived from
the result of the SELECT statement.
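So, sketched against a hypothetical source table src, the statement reduces to:

  // CTAS: the schema of tbl is taken from the SELECT, no column list.
  hiveContext.sql(
    """CREATE TABLE tbl
      |AS SELECT col1, col2, col3 FROM src""".stripMargin)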
From: Younes Naguib [mailto:younes.nag...@tritondigital.com]
Sent: January-26-16 11:42 AM
To: 'Tejas Patil'
Cc: user@spark.apache.org
Subject: RE: ctas fails with "No plan for CreateTableAsSelect"
The destination table is partitioned. If I don’t specify the columns, I get:
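Note that Hive's CTAS cannot create a partitioned table at all, which would explain the dead end here. A common workaround, sketched with hypothetical names: create the table first, then load it with a dynamic-partition insert.

  hiveContext.sql("SET hive.exec.dynamic.partition.mode=nonstrict")
  hiveContext.sql(
    """CREATE TABLE tbl (col1 timestamp, col2 string, col3 int)
      |PARTITIONED BY (year int, month int, day int)""".stripMargin)
  hiveContext.sql(
    """INSERT OVERWRITE TABLE tbl PARTITION (year, month, day)
      |SELECT col1, col2, col3, year, month, day FROM src""".stripMargin)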
Hi all,
I'm connected to the thrift server using beeline on Spark 1.6.
I used: cache table tbl as select * from table1
I see table1 in storage memory and I can use it, but when I reconnect, I can't
query it anymore.
I get: Error: org.apache.spark.sql.AnalysisException: Table not found:
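One likely explanation: cache table tbl as select ... registers a temporary table, and in Spark 1.6's multi-session thrift server temporary tables are scoped to the connection that created them, so they disappear on reconnect. Two hedged options: cache the metastore table itself (cache table table1), whose name still resolves from a new session, or restore the old shared-session behavior in spark-defaults.conf:

  spark.sql.hive.thriftServer.singleSession true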
Hi all,
I get this error when running "show current roles;":
2015-12-15 15:50:41 WARN org.apache.hive.service.cli.thrift.ThriftCLIService
ThriftCLIService:681 - Error fetching results:
org.apache.hive.service.cli.HiveSQLException: Couldn't find log associated with
operation handle:
The one coming with Spark 1.5.2.
y
From: Ted Yu [mailto:yuzhih...@gmail.com]
Sent: December-15-15 1:59 PM
To: Younes Naguib
Cc: user@spark.apache.org
Subject: Re: Securing objects on the thrift server
Which Hive release are you using?
Please take a look at HIVE-8529
Cheers
On Tue, Dec 15
Hi all,
Is there any documentation on how to setup metastore_db on MySQL in Spark?
I did find a load of information, but it all seems to be some "hack" for
Spark.
Thanks
Younes
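For what it's worth, the non-hacky route is a hive-site.xml in Spark's conf/ directory pointing the metastore at MySQL; a sketch, with the host, database, and credentials as placeholders (the MySQL connector jar must also be on the classpath, e.g. via --jars):

  <configuration>
    <property>
      <name>javax.jdo.option.ConnectionURL</name>
      <value>jdbc:mysql://mysql-host:3306/metastore_db?createDatabaseIfNotExist=true</value>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionDriverName</name>
      <value>com.mysql.jdbc.Driver</value>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionUserName</name>
      <value>hive</value>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionPassword</name>
      <value>secret</value>
    </property>
  </configuration>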
new ()
}.toDF()
myDF.registerTempTable("tbl")
sqlContext.sql("select count(1) from tbl").collect()
Any help/idea?
Thanks,
Younes Naguib
Triton Digital | 1440 Ste-Catherine W., Suite 1200 | Montreal, QC H3G 1R8
Tel.: +1 514 448 4037 x2688 | Tel.: +1
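The class name in the snippet above was truncated; a self-contained sketch of the same pattern, with a hypothetical case class standing in for it:

  // In spark-shell, sc and sqlContext are already provided.
  import sqlContext.implicits._

  case class Rec(id: Int, name: String)

  val myDF = sc.parallelize(1 to 1000).map(i => Rec(i, s"name-$i")).toDF()
  myDF.registerTempTable("tbl")
  sqlContext.sql("select count(1) from tbl").collect()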
Hi all,
I'm running a Spark shell: bin/spark-shell --executor-memory 32G --driver-memory 8G
I keep getting:
15/10/30 13:41:59 WARN MemoryManager: Total allocation exceeds
95.00% (2,147,483,647 bytes) of heap memory
Any help?
Thanks,
Younes Naguib
Triton Digital | 1440 Ste
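A hedged guess: that warning usually comes from Parquet's writer-side MemoryManager (2,147,483,647 bytes is Int.MaxValue), not from the executor heap as such. If Parquet writes are involved, shrinking the row-group size may quiet it:

  // Lower Parquet's row-group (block) size to 64 MB for writes from this context.
  sc.hadoopConfiguration.setInt("parquet.block.size", 64 * 1024 * 1024)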
Hi all,
I use the thrift server, and I cache a table using "cache table mytab".
Is there any SQL to broadcast it too?
Thanks
Younes Naguib
Triton Digital | 1440 Ste-Catherine W., Suite 1200 | Montreal, QC H3G 1R8
Tel.: +1 514 448 4037 x2688 | Tel.: +1 866 448 4037 x2688 |
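As far as I know there is no CACHE-style SQL statement that broadcasts a table in 1.x; the closest knob is the planner's automatic threshold, which can be raised so small tables get broadcast in joins, e.g. from beeline (the value is a sketch):

  SET spark.sql.autoBroadcastJoinThreshold=134217728;  -- 128 MB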
Hi all,
Does anyone have any experience with SuccinctRDD?
Thanks,
Younes
Hi all,
I'm running SQL on Spark 1.5.1, using tables backed by Parquet.
My tables are not pruned when joined on partition columns.
Ex:
Select from tab where partcol=1 will prune on value 1.
Select from tab join dim on (dim.partcol=tab.partcol) where dim.partcol=1
will scan all partitions.
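Until dynamic partition pruning lands, one manual workaround is to duplicate the literal onto the fact table's partition column so static pruning applies:

  Select ... from tab join dim on (dim.partcol=tab.partcol)
  where dim.partcol=1 and tab.partcol=1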
Thanks,
Do you have a Jira I can follow for this?
y
From: Michael Armbrust [mailto:mich...@databricks.com]
Sent: October-16-15 2:18 PM
To: Younes Naguib
Cc: user@spark.apache.org
Subject: Re: Dynamic partition pruning
We don't support dynamic partition pruning yet.
On Fri, Oct 16, 2015 at 10
Hi,
This feature was added in Hive 1.3.
https://issues.apache.org/jira/browse/HIVE-9152
Any idea when this would be in Spark? Or is it already?
Any work around in spark 1.5.1?
Thanks,
Younes
Hi,
We've been using the JDBC thrift server for a couple of weeks now and running
queries on it like a regular RDBMS.
We're about to deploy it in a shared production cluster.
Any advice or warnings on such a setup? YARN or Mesos?
How about dynamic resource allocation in an already-running thrift
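For the dynamic-allocation part, the usual YARN recipe (values are placeholders, and the external shuffle service must also be registered as a NodeManager aux-service):

  spark.dynamicAllocation.enabled      true
  spark.shuffle.service.enabled        true
  spark.dynamicAllocation.minExecutors 2
  spark.dynamicAllocation.maxExecutors 50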
The original TSV files are 600GB and generated 40k files of 15-25MB each.
y
From: Cheng Lian [mailto:lian.cs@gmail.com]
Sent: October-07-15 3:18 PM
To: Younes Naguib; 'user@spark.apache.org'
Subject: Re: Parquet file size
Why do you want larger files? Doesn't the resulting Parquet file contain all
Well, I only have data for 2015-08, so in the end, only 31 partitions.
What I'm looking for is reasonably sized partitions.
In any case, being able to control the size or number of the output Parquet
files would be nice.
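There is no direct file-size setting, but coalescing before the write bounds the task count and hence the file count; a sketch with hypothetical names:

  val df = sqlContext.table("src")  // hypothetical source table
  // 64 write tasks => at most 64 files per output partition directory
  df.coalesce(64)
    .write.partitionBy("year", "month", "day")
    .parquet("/path/to/output")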
Younes Naguib Streaming Division
Triton Digital | 1440 Ste
Younes Naguib
Triton Digital | 1440 Ste-Catherine W., Suite 1200 | Montreal, QC H3G 1R8
Tel.: +1 514 448 4037 x2688 | Tel.: +1 866 448 4037 x2688 |
younes.nag...@tritondigital.com
Hi,
We're using a spark thrift server and we connect using jdbc to run queries.
Every time we run a SET query, like "set schema", it seems to affect the whole
server, not just the session.
Is that expected behavior, or am I missing something?
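That matches the pre-1.6 thrift server, which shared a single HiveContext across all connections, so SET was effectively global. Spark 1.6 added a multi-session mode that scopes SET commands and temporary tables per connection; my understanding is it is the default there, controlled by:

  spark.sql.hive.thriftServer.singleSession false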
Younes Naguib
Triton Digital | 1440 Ste