Hi,
How can I efficiently convert a pair RDD [K, V] to a map [K, Array(V)] in PySpark?
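If I remember the PySpark API correctly, the usual idiom is rdd.groupByKey().mapValues(list).collectAsMap(). A plain-Python sketch of the same grouping (no Spark needed), assuming the pairs fit in driver memory:

```python
from collections import defaultdict

def group_pairs(pairs):
    """Group (key, value) pairs into {key: [values]}, mirroring
    rdd.groupByKey().mapValues(list).collectAsMap() in PySpark."""
    grouped = defaultdict(list)
    for k, v in pairs:
        grouped[k].append(v)
    return dict(grouped)

print(group_pairs([("a", 1), ("b", 2), ("a", 3)]))  # {'a': [1, 3], 'b': [2]}
```

Note that groupByKey shuffles every value; if you only need an aggregate per key, aggregateByKey or reduceByKey is usually cheaper.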
Best,
Patcharee
-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
Hi,
Which DataFrame API can access Hive complex types (Struct, Array, Map)?
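If I recall the DataFrame API correctly, struct fields are reached with dot syntax (df.select("coord.x")) and array/map elements with indexing or getItem (df["values"][0], df["attrs"]["unit"]). A plain-Python analog of the row shapes involved (column names hypothetical, no Spark required):

```python
# A row with Hive-style complex types, modeled with plain Python objects:
# a struct becomes a dict, an array a list, a map a dict.
row = {
    "coord": {"x": 1.5, "y": 2.5},   # struct<x:float, y:float>
    "values": [10, 20, 30],          # array<int>
    "attrs": {"unit": "m"},          # map<string,string>
}

# The corresponding DataFrame selections would be roughly:
#   df.select("coord.x"), df.select(df["values"][0]), df.select(df["attrs"]["unit"])
print(row["coord"]["x"], row["values"][0], row["attrs"]["unit"])  # 1.5 10 m
```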
Thanks,
Patcharee
Hi,
How can I write a partitioned ORC file using OrcNewOutputFormat in MapReduce?
Thanks
Patcharee
Hi,
How can I override the log4j level using --hiveconf? I want to use the ERROR
level for some tasks.
Thanks,
Patcharee
Hi,
The query result:
11236119012.64043-5.9708868.5592070.0 0.0
0.0-19.6869931308.804799848.00.006196644 0.00.0
301.274750.382470460.0NULL11 20081
11236122012.513598-6.36717137.3927946 0.0
Hi,
I have a table partitioned by columns a, b, c, d. I want to run ALTER TABLE ...
CONCATENATE on this table. Is it possible to use a wildcard in the ALTER
command to alter several partitions at a time? For example:
alter table TestHive partition (a=1, b=*, c=2, d=*) CONCATENATE;
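As far as I know, wildcards are not accepted in a CONCATENATE partition spec, so one workaround is to enumerate the concrete partition values and issue one statement per combination. A small Python sketch that generates the statements (table name and partition values are hypothetical):

```python
from itertools import product

def concat_statements(table, fixed, wildcard_values):
    """Build one ALTER ... CONCATENATE statement per combination of the
    wildcarded partition values; 'fixed' holds the pinned columns."""
    stmts = []
    cols = sorted(wildcard_values)
    for combo in product(*(wildcard_values[c] for c in cols)):
        spec = dict(fixed)
        spec.update(zip(cols, combo))
        spec_sql = ", ".join(f"{k}={v}" for k, v in sorted(spec.items()))
        stmts.append(f"ALTER TABLE {table} PARTITION ({spec_sql}) CONCATENATE;")
    return stmts

for s in concat_statements("TestHive", {"a": 1, "c": 2}, {"b": [1, 2], "d": [7]}):
    print(s)
```

Each generated statement can then be fed to the Hive CLI or a JDBC session.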
BR,
Patcharee
Hi,
I am using Spark 1.4. I wanted to serialize with KryoSerializer, but got a
ClassNotFoundException. The configuration and exception are below. When I
submitted the job, I also provided --jars mylib.jar, which contains
WRFVariableZ.
conf.set("spark.serializer", "org.apache.spark.serializer.KryoS
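For this kind of ClassNotFoundException, the class usually needs to be both registered with Kryo and present on the executor classpath. A sketch of the settings typically involved, shown as a plain Python dict (the class and jar names come from the post above; whether --jars alone suffices depends on the deploy mode, so spark.jars is listed as an assumption):

```python
# Spark configuration keys commonly involved when Kryo cannot find a user class.
kryo_conf = {
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    # Register the user class so Kryo can resolve it (fully qualified name needed):
    "spark.kryo.classesToRegister": "WRFVariableZ",
    # Ship the jar that contains the class to the executors:
    "spark.jars": "mylib.jar",
}

for key, value in kryo_conf.items():
    print(f"{key} = {value}")
```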
Hi,
How can I know how much memory each executor (with one core) needs to
execute a job? If there are many cores per executor, will the memory
needed be the product (memory needed per one-core executor * number of
cores)?
Any suggestions/guidelines?
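As a rough rule, an executor's heap must cover all concurrently running tasks (one per core) plus a fixed per-executor overhead, so memory scales roughly linearly with cores rather than being a pure multiplication. A toy calculation under that assumption (the 384 MB overhead mirrors YARN's default minimum; all numbers hypothetical):

```python
def executor_memory_mb(per_task_mb, cores, overhead_mb=384):
    """Rough sizing: memory for each concurrent task (one per core)
    plus a fixed per-executor overhead."""
    return per_task_mb * cores + overhead_mb

print(executor_memory_mb(512, 1))  # 896
print(executor_memory_mb(512, 4))  # 2432
```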
BR,
Patcharee
---
://spark.apache.org/docs/latest/sql-programming-guide.html#hive-tables
Hope this helps,
Will
On June 13, 2015, at 3:36 PM, pth001 wrote:
Hi,
I am using spark 0.14. I am trying to insert data into a Hive table (in
ORC format) from a DataFrame.
partitionedTestDF.write.format("org.apache.spark.sql.hive.orc.DefaultSource")
  .mode(org.apache.spark.sql.SaveMode.Append)
  .partitionBy("zone", "z", "year", "month")
  .saveAsTable("testorc")
When this job is s
Hi,
My Pig job on Tez (storing a dataset into a partitioned Hive table) throws
the following exception. What could be wrong, and how can I fix it?
2015-06-09 10:59:57,268 ERROR [TezChild] runtime.PigProcessor:
Encountered exception while processing:
org.apache.pig.backend.executionengine.ExecException:
Hi,
I tried to cast a relation (one row) to Scala. It works well when the cast
field is an Integer, but if the cast field is a FLOAT I get
ClassCastException: java.lang.Integer cannot be cast to java.lang.String.
coordinate_cossin_xy = FOREACH join_coordinate_cossin_xy GENERATE
coordinate_xy::xlo
Hi,
I am new to Pig. First I queried a Hive table (x = LOAD 'x' USING
org.apache.hive.hcatalog.pig.HCatLoader();) and got a single
record/value. How can I use this single value as a filter in another
query? I hope to get better performance by filtering as early as possible.
BR,
Patcharee
Hi,
I ran a Pig script on Tez and got an EOFException. I checked
http://wiki.apache.org/hadoop/EOFException but have no idea how to fix it.
However, I did not get the exception when I executed the same Pig script
on MR.
I am using HadoopVersion: 2.6.0.2.2.4.2-2, PigVersion: 0.14.0.2.2.4.
Hi,
How can I create a pipeline (containing a sequence of Pig scripts)?
BR,
Patcharee