sparkR 3rd library

2017-09-04 Thread patcharee
"rbga" at org.apache.spark.api.r.RRunner.compute(RRunner.scala:108) at org.apache.spark.api.r.BaseRRDD.compute(RRDD.scala:51) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at org.apache.spark.rdd.RDD.iterator(RDD.scala Any ideas/suggestions? BR,

simple application on tez + llap

2017-02-24 Thread Patcharee Thongtra
Hi, I found an example of simple applications like wordcount running on tez - https://github.com/apache/tez/tree/master/tez-examples/src/main/java/org/apache/tez/examples. However, how to run this on tez+llap? Any suggestions? BR, Patcharee

Re: import sql file

2016-11-23 Thread patcharee
I exported a SQL table into a .sql file and would like to import it into Hive. Best, Patcharee On 23. nov. 2016 10:40, Markovitz, Dudu wrote: Hi Patcharee The question is not clear. Dudu -Original Message- From: patcharee [mailto:patcharee.thong...@uni.no] Sent: Wednesday, November 23

import sql file

2016-11-23 Thread patcharee
Hi, How can I import a .sql file into Hive? Best, Patcharee
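
A hedged sketch of one way to do this, assuming the exported .sql file contains plain, semicolon-separated HiveQL statements (the Hive CLI equivalent would be "hive -f export.sql"); the file path and statement compatibility are assumptions:

    import scala.io.Source
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    val sc = new SparkContext(new SparkConf().setAppName("run-sql-file"))
    val hc = new HiveContext(sc)

    // Read the exported file and run each non-empty statement against Hive.
    Source.fromFile("/path/to/export.sql").mkString
      .split(";")
      .map(_.trim)
      .filter(_.nonEmpty)
      .foreach(hc.sql)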

Re: hiveserver2 java heap space

2016-10-24 Thread Patcharee Thongtra
It works on Hive cli Patcharee On 10/24/2016 11:51 AM, Mich Talebzadeh wrote: does this work ok through Hive cli? Dr Mich Talebzadeh LinkedIn /https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw/ http://talebzadehmich.wordpress.com *Disclaimer:* Use

hiveserver2 java heap space

2016-10-24 Thread Patcharee Thongtra
I wonder why I got this error because I query just ONE line. Any ideas? Thanks, Patcharee

hiveserver2 GC overhead limit exceeded

2016-10-23 Thread patcharee
from org.apache.hadoop.hive.ql.exec.DDLTask. GC overhead limit exceeded (state=08S01,code=1) How can I solve this? How can I identify whether this error is from the client (beeline) or from hiveserver2? Thanks, Patcharee

Re: Spark DataFrame Plotting

2016-09-08 Thread patcharee
Hi Moon, When I generate an extra column (the schema will be Index:Int, A:Double, B:Double), what sql command generates a graph with 2 lines (Index as the X-axis, BOTH A and B on the Y-axis)? Do I need to group by? Thanks! Patcharee On 07. sep. 2016 16:58, moon soo Lee wrote: You will need
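
For reference, a hedged sketch of the index approach discussed here, assuming a DataFrame df with columns A and B and an available sqlContext; the column positions and names are assumptions:

    import sqlContext.implicits._

    // Add a row index so both series can share the same X-axis.
    val indexed = df.rdd.zipWithIndex.map { case (row, i) =>
      (i, row.getDouble(0), row.getDouble(1))   // assumes A is column 0, B is column 1
    }.toDF("Index", "A", "B")

    indexed.registerTempTable("ab")
    // In a %sql paragraph: SELECT Index, A, B FROM ab
    // then choose Index as the key and A, B as values in the chart settings.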

Re: Spark DataFrame Plotting

2016-09-07 Thread patcharee
A normal select * gives me one column on the X-axis and another on the Y-axis. I cannot get both A:Double and B:Double displayed on the Y-axis. How can I do that? Patcharee On 07. sep. 2016 11:05, Abhisar Mohapatra wrote: You can do a normal select * on the dataframe and it would be automatically interpreted

Spark DataFrame Plotting

2016-09-07 Thread patcharee
Hi, I have a dataframe with this schema A:Double, B:Double. How can I plot this dataframe as two lines (comparing A and B at each step)? Best, Patcharee

what contribute to Task Deserialization Time

2016-07-21 Thread patcharee
Patcharee

Re: Failed to stream on Yarn cluster

2016-04-28 Thread patcharee
is the taskmanager.out and how can I change it? Best, Patcharee On 28. april 2016 13:18, Maximilian Michels wrote: Hi Patcharee, What do you mean by "nothing happened"? There is no output? Did you check the logs? Cheers, Max On Thu, Apr 28, 2016 at 12:10 PM, patcharee <pa

Failed to stream on Yarn cluster

2016-04-28 Thread patcharee
happened. Any ideas? I tested the word count example from hdfs file on Yarn cluster and it worked fine. Best, Patcharee

Re: pyspark split pair rdd to multiple

2016-04-20 Thread patcharee
I can also use dataframe. Any suggestions? Best, Patcharee On 20. april 2016 10:43, Gourav Sengupta wrote: Is there any reason why you are not using data frames? Regards, Gourav On Tue, Apr 19, 2016 at 8:51 PM, pth001 <patcharee.thong...@uni.no <mailto:patcharee.thong...@uni.no&g

Re: build r-intepreter

2016-04-14 Thread Patcharee Thongtra
Yes, I did not install R. Stupid me. Thanks for your guide! BR, Patcharee On 04/13/2016 08:23 PM, Eric Charles wrote: Can you post the full stacktrace you have (look also at the log file)? Did you install R on your machine? SPARK_HOME is optional. On 13/04/16 15:39, Patcharee Thongtra wrote

executor running time vs getting result from jupyter notebook

2016-04-14 Thread Patcharee Thongtra
be the factor of time spent on these steps? BR, Patcharee

Re: build r-intepreter

2016-04-13 Thread Patcharee Thongtra
spark for testing first. BR, Patcharee On 04/13/2016 02:52 PM, Patcharee Thongtra wrote: Hi, I have been struggling with R interpreter / SparkR interpreter. Is below the right command to build zeppelin with R interpreter / SparkR interpreter? mvn clean package -Pspark-1.6 -Phadoop-2.6

ExclamationTopology workers executors vs tasks

2016-03-01 Thread patcharee
Also from the Storm UI, the Num executors and Num tasks of the Spout word and the Bolts exclaim1 and exclaim2 are 10, 3 and 2 respectively (the same as defined in the code). Thanks, Patcharee

kafka streaming topic partitions vs executors

2016-02-26 Thread patcharee
as the topic's partitions). However, some executors are given more than one task and work on these tasks sequentially. Why does Spark not distribute these 10 tasks to 10 executors? How can I do that? Thanks, Patcharee

Re: streaming textFileStream problem - got only ONE line

2016-01-29 Thread patcharee
I moved them every interval to the monitored directory. Patcharee On 25. jan. 2016 22:30, Shixiong(Ryan) Zhu wrote: Did you move the file into "hdfs://helmhdfs/user/patcharee/cerdata/", or write into it directly? `textFileStream` requires that files must be written to the monitored
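
For reference, a minimal sketch of the pattern described here (Spark Streaming's textFileStream watching an HDFS directory that finished files are moved into); the batch interval and output are assumptions:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf().setAppName("dir-stream")
    val ssc  = new StreamingContext(conf, Seconds(60))

    // Only files moved (atomically) into this directory after start-up are picked up.
    val lines = ssc.textFileStream("hdfs://helmhdfs/user/patcharee/cerdata/")
    lines.count().print()

    ssc.start()
    ssc.awaitTermination()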

Pyspark filter not empty

2016-01-29 Thread patcharee
Hi, In pyspark how can I filter if a column of a dataframe is not empty? I tried: dfNotEmpty = df.filter(df['msg']!='') It did not work. Thanks, Patcharee
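
A hedged sketch of one way this is usually handled, assuming the column can be NULL as well as the empty string (the sample data is made up); the equivalent SQL-expression string form, df.filter("msg IS NOT NULL AND msg != ''"), should also work from PySpark:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext
    import org.apache.spark.sql.functions.col

    val sc = new SparkContext(new SparkConf().setAppName("filter-not-empty"))
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    val df = sc.parallelize(Seq(("a", "hello"), ("b", ""), ("c", null))).toDF("id", "msg")
    // "Not empty" usually means neither NULL nor the empty string.
    val dfNotEmpty = df.filter(col("msg").isNotNull && (col("msg") !== ""))
    dfNotEmpty.show()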

spark streaming input rate strange

2016-01-22 Thread patcharee
rises to 10,000, stays at 10,000 for a while and drops to about 7000-8000. - When clients = 20,000 the event rate rises to 20,000, stays at 20,000 for a while and drops to about 15000-17000. The same pattern. Processing time is just about 400 ms. Any ideas/suggestions? Thanks, Patcharee

visualize data from spark streaming

2016-01-20 Thread patcharee
Hi, How to visualize realtime data (in graph/chart) from spark streaming? Any tools? Best, Patcharee

bad performance on PySpark - big text file

2015-12-08 Thread patcharee
the log of these two input splits (check python.PythonRunner: Times: total ... ) 15/12/08 07:37:15 INFO rdd.NewHadoopRDD: Input split: hdfs://helmhdfs/user/patcharee/ntap-raw-20151015-20151126/html2/budisansblog.blogspot.com.html:39728447488+134217728 15/12/08 08:49:30 INFO python.PythonRunner

Spark UI - Streaming Tab

2015-12-04 Thread patcharee
need to configure the history UI somehow to get such an interface? Thanks, Patcharee

Spark applications metrics

2015-12-04 Thread patcharee
Hi, How can I see the summary of data read/write, shuffle read/write, etc. of an application, not per stage? Thanks, Patcharee
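
One hedged option for an application-wide view is the Spark monitoring REST API (available since Spark 1.4); the host, port and application id below are made-up placeholders, and the per-executor totals (totalInputBytes, totalShuffleRead, totalShuffleWrite, ...) would still need to be summed:

    import scala.io.Source

    val appId = "app-20151204000000-0001"    // hypothetical application id
    val url   = s"http://history-server:18080/api/v1/applications/$appId/executors"

    // Prints one JSON record per executor, including totalInputBytes,
    // totalShuffleRead and totalShuffleWrite.
    println(Source.fromURL(url).mkString)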

Re: Spark UI - Streaming Tab

2015-12-04 Thread patcharee
I ran streaming jobs, but no streaming tab appeared for those jobs. Patcharee On 04. des. 2015 18:12, PhuDuc Nguyen wrote: I believe the "Streaming" tab is dynamic - it appears once you have a streaming job running, not when the cluster is simply up. It does not depend on 1.6 an

Re: Spark Streaming - History UI

2015-12-02 Thread patcharee
I meant there is no streaming tab at all. It looks like I need version 1.6 Patcharee On 02. des. 2015 11:34, Steve Loughran wrote: The history UI doesn't update itself for live apps (SPARK-7889) -though I'm working on it Are you trying to view a running streaming job? On 2 Dec 2015, at 05

Spark Streaming - History UI

2015-12-01 Thread patcharee
Hi, On my history server UI, I cannot see the "streaming" tab for any streaming jobs. I am using version 1.5.1. Any ideas? Thanks, Patcharee

custom inputformat recordreader

2015-11-26 Thread Patcharee Thongtra
Hi, In python how can I use an inputformat / custom recordreader? Thanks, Patcharee

data local read counter

2015-11-25 Thread Patcharee Thongtra
Hi, Is there a counter for data-local reads? I understood it to be the locality level counter, but it seems not. Thanks, Patcharee

Re: query orc file by hive

2015-11-13 Thread patcharee
Hi, It works with non-partitioned ORC, but does not work with (2-column) partitioned ORC. Thanks, Patcharee On 09. nov. 2015 10:55, Elliot West wrote: Hi, You can create a table and point the location property to the folder containing your ORC file: CREATE EXTERNAL TABLE orc_table

Re: query orc file by hive

2015-11-13 Thread patcharee
Hi, It works after I added the partitions with ALTER TABLE ... ADD PARTITION. Thanks! My partitioned ORC files (directories) are created by Spark, therefore Hive is not aware of the partitions automatically. Best, Patcharee On 13. nov. 2015 13:08, Elliot West wrote: Have you added the partitions to the meta store? ALTER TABLE
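
For reference, a hedged sketch of the fix described in this thread, run here through a Spark HiveContext since the files were written by Spark; the table name, columns, partition values and HDFS location are made-up placeholders:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    val sc = new SparkContext(new SparkConf().setAppName("register-orc-partitions"))
    val hc = new HiveContext(sc)

    hc.sql("""CREATE EXTERNAL TABLE IF NOT EXISTS orc_table (x INT, y INT)
              PARTITIONED BY (dt INT, hh INT)
              STORED AS ORC
              LOCATION 'hdfs:///user/patcharee/orc_out'""")

    // Hive does not discover partitions written by Spark on its own; register
    // them explicitly (MSCK REPAIR TABLE orc_table is the bulk alternative).
    hc.sql("ALTER TABLE orc_table ADD IF NOT EXISTS PARTITION (dt=20151113, hh=0)")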

query orc file by hive

2015-11-09 Thread patcharee
Hi, How can I query an orc file (*.orc) by Hive? This orc file is created by other apps, like spark, mr. Thanks, Patcharee

[jira] [Issue Comment Deleted] (SPARK-11087) spark.sql.orc.filterPushdown does not work, No ORC pushdown predicate

2015-11-06 Thread patcharee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] patcharee updated SPARK-11087: -- Comment: was deleted (was: I found a scenario where the problem exists

[jira] [Issue Comment Deleted] (SPARK-11087) spark.sql.orc.filterPushdown does not work, No ORC pushdown predicate

2015-11-06 Thread patcharee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] patcharee updated SPARK-11087: -- Comment: was deleted (was: Hi [~zzhan], the problem actually happens when I generates orc file

[jira] [Reopened] (SPARK-11087) spark.sql.orc.filterPushdown does not work, No ORC pushdown predicate

2015-11-06 Thread patcharee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] patcharee reopened SPARK-11087: --- I found a scenario where the problem exists > spark.sql.orc.filterPushdown does not work, No

[jira] [Commented] (SPARK-11087) spark.sql.orc.filterPushdown does not work, No ORC pushdown predicate

2015-11-06 Thread patcharee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14993398#comment-14993398 ] patcharee commented on SPARK-11087: --- Hi [~zzhan], the problem actually happens when I generates orc

[jira] [Commented] (SPARK-11087) spark.sql.orc.filterPushdown does not work, No ORC pushdown predicate

2015-11-06 Thread patcharee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14993771#comment-14993771 ] patcharee commented on SPARK-11087: --- Hi, I found a scenario where the predicate does not work again

[jira] [Comment Edited] (SPARK-11087) spark.sql.orc.filterPushdown does not work, No ORC pushdown predicate

2015-11-06 Thread patcharee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14993771#comment-14993771 ] patcharee edited comment on SPARK-11087 at 11/6/15 2:56 PM: Hi, I found

How to run parallel on each DataFrame group

2015-11-05 Thread patcharee
The problem is that each group, after being filtered, is handled by an executor one by one. How can I change the code to allow each group to run in parallel? I looked at groupBy, but it seems to be only for aggregation. Thanks, Patcharee
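
A hedged sketch of letting Spark schedule the groups as parallel tasks instead of filtering and handling them one by one on the driver; it assumes a DataFrame df whose first two columns form the group key, and the per-group process function is a made-up placeholder:

    import org.apache.spark.sql.Row

    // Hypothetical per-group computation.
    def process(rows: Iterable[Row]): Int = rows.size

    val resultsByKey = df.rdd
      .groupBy(row => (row.getInt(0), row.getInt(1)))    // key columns are assumptions
      .map { case (key, rows) => (key, process(rows)) }  // each group runs inside a task

    resultsByKey.collect().foreach(println)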

execute native system commands in Spark

2015-11-02 Thread patcharee
Hi, Is it possible to execute native system commands (in parallel) in Spark, like scala.sys.process? Best, Patcharee
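
A hedged sketch of two options, assuming the commands are available on every worker node: scala.sys.process inside a transformation, or the built-in RDD.pipe:

    import org.apache.spark.{SparkConf, SparkContext}
    import scala.sys.process._

    val sc = new SparkContext(new SparkConf().setAppName("native-commands"))

    // Option 1: scala.sys.process runs on the executors inside mapPartitions.
    val hosts = sc.parallelize(1 to 4, 4).mapPartitions { _ =>
      Iterator("hostname".!!.trim)
    }
    hosts.collect().foreach(println)

    // Option 2: pipe each record of an RDD through an external command.
    val piped = sc.parallelize(Seq("a", "b", "c")).pipe("cat")
    piped.collect().foreach(println)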

Min-Max Index vs Bloom filter

2015-11-02 Thread patcharee
Hi, For the ORC format, in which scenario is a bloom filter better than a min-max index? Best, Patcharee

[jira] [Closed] (SPARK-11087) spark.sql.orc.filterPushdown does not work, No ORC pushdown predicate

2015-10-23 Thread patcharee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] patcharee closed SPARK-11087. - Resolution: Not A Problem The predicate is indeed generated and can be found in the executor log

[jira] [Commented] (SPARK-11087) spark.sql.orc.filterPushdown does not work, No ORC pushdown predicate

2015-10-23 Thread patcharee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14970786#comment-14970786 ] patcharee commented on SPARK-11087: --- [~zzhan] I found the predicate generated in the executor log

the column names removed after insert select

2015-10-23 Thread patcharee
while it is supposed to be - Type: struct<date:int,hh:int,x:int,y:int> Any ideas how this happened and how I can fix it? Please advise. BR, Patcharee

the number of files after merging

2015-10-22 Thread patcharee
the whole table, not one partition at a time? Thanks, Patcharee

[jira] [Commented] (SPARK-11087) spark.sql.orc.filterPushdown does not work, No ORC pushdown predicate

2015-10-21 Thread patcharee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14967661#comment-14967661 ] patcharee commented on SPARK-11087: --- Hi [~zzhan] What version of hive and orc file you are using? Can

[jira] [Comment Edited] (SPARK-11087) spark.sql.orc.filterPushdown does not work, No ORC pushdown predicate

2015-10-18 Thread patcharee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14960296#comment-14960296 ] patcharee edited comment on SPARK-11087 at 10/19/15 3:34 AM: - [~zzhan] Below

[jira] [Commented] (SPARK-11087) spark.sql.orc.filterPushdown does not work, No ORC pushdown predicate

2015-10-16 Thread patcharee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14960296#comment-14960296 ] patcharee commented on SPARK-11087: --- [~zhazhan] Below is my test. Please check. I tried to change

Re: sql query orc slow

2015-10-13 Thread Patcharee Thongtra
is not sorted / indexed - the split strategy hive.exec.orc.split.strategy BR, Patcharee On 10/09/2015 08:01 PM, Zhan Zhang wrote: That is weird. Unfortunately, there is no debug info available on this part. Can you please open a JIRA to add some debug information on the driver side? Thanks. Zhan

orc table with sorted field

2015-10-13 Thread Patcharee Thongtra
ddl page, it seems only bucketed tables can be sorted. Any suggestions please. BR, Patcharee

Re: sql query orc slow

2015-10-13 Thread Patcharee Thongtra
Hi Zhan Zhang, Here is the issue https://issues.apache.org/jira/browse/SPARK-11087 BR, Patcharee On 10/13/2015 06:47 PM, Zhan Zhang wrote: Hi Patcharee, I am not sure which side is wrong, driver or executor. If it is executor side, the reason you mentioned may be possible

[jira] [Created] (SPARK-11087) spark.sql.orc.filterPushdown does not work, No ORC pushdown predicate

2015-10-13 Thread patcharee (JIRA)
patcharee created SPARK-11087: - Summary: spark.sql.orc.filterPushdown does not work, No ORC pushdown predicate Key: SPARK-11087 URL: https://issues.apache.org/jira/browse/SPARK-11087 Project: Spark

Re: sql query orc slow

2015-10-09 Thread patcharee
Yes, the predicate pushdown is enabled, but it still takes longer than the first method. BR, Patcharee On 08. okt. 2015 18:43, Zhan Zhang wrote: Hi Patcharee, Did you enable the predicate pushdown in the second method? Thanks. Zhan Zhang On Oct 8, 2015, at 1:43 AM, patcharee

Re: sql query orc slow

2015-10-09 Thread patcharee
I set hiveContext.setConf("spark.sql.orc.filterPushdown", "true"). But the log shows no ORC pushdown predicate for my query with a WHERE clause: 15/10/09 19:16:01 DEBUG OrcInputFormat: No ORC pushdown predicate I do not understand what is wrong with this. BR, Patcharee On
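
For reference, a hedged sketch of enabling the pushdown; note that, per the later follow-up in SPARK-11087, the "ORC pushdown predicate" message shows up in the executor log rather than the driver log. The table and column names are made-up placeholders:

    import org.apache.spark.sql.hive.HiveContext

    val hc = new HiveContext(sc)    // assumes an existing SparkContext sc
    hc.setConf("spark.sql.orc.filterPushdown", "true")

    // A predicate on a non-partition column can be pushed down to the ORC reader;
    // check the executor logs for "ORC pushdown predicate".
    val rows = hc.sql("SELECT * FROM orc_table WHERE x = 320")
    rows.show()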

Re: sql query orc slow

2015-10-09 Thread patcharee
this time the pushdown predicate was generated in the log, but the results were wrong (no results at all): 15/10/09 18:36:06 INFO OrcInputFormat: ORC pushdown predicate: leaf-0 = (EQUALS x 320) expr = leaf-0 Any ideas what is wrong with this? Why is the ORC pushdown predicate not applied by the system? BR

hiveContext sql number of tasks

2015-10-07 Thread patcharee
to force Spark SQL to use fewer tasks? BR, Patcharee

Idle time between jobs

2015-09-16 Thread patcharee
.scala:143 15/09/16 11:21:08 INFO DAGScheduler: Got job 2 (saveAsTextFile at GenerateHistogram.scala:143) with 1 output partitions 15/09/16 11:21:08 INFO DAGScheduler: Final stage: ResultStage 2(saveAsTextFile at GenerateHistogram.scala:143) BR,

spark performance - executor computing time

2015-09-15 Thread patcharee
ize and low GC time as the others. What can impact the executor computing time? Any suggestions on what parameters I should monitor/configure? BR, Patcharee

spark 1.5 sort slow

2015-09-01 Thread patcharee
y configuration explicitly? Any suggestions? BR, Patcharee

embedded pig in the custer

2015-07-22 Thread patcharee
. BR, Patcharee

Re: character '' not supported here

2015-07-20 Thread patcharee
data, like select count(*) from Table, any more; I just got the error "line 1:1 character '' not supported here", no matter whether the Tez or MR engine is used. How did you solve the problem in your case? BR, Patcharee On 18. juli 2015 21:26, Nitin Pawar wrote: can you tell exactly what steps you did? also did you

character '' not supported here

2015-07-18 Thread patcharee
character '' not supported here line 1:141 character '' not supported here line 1:142 character '' not supported here line 1:143 character '' not supported here line 1:144 character '' not supported here line 1:145 character '' not supported here line 1:146 character '' not supported here BR, Patcharee

Re: character '' not supported here

2015-07-18 Thread patcharee
This "select * from table limit 5;" works, but other queries do not. So? Patcharee On 18. juli 2015 12:08, Nitin Pawar wrote: can you do select * from table limit 5; On Sat, Jul 18, 2015 at 3:35 PM, patcharee patcharee.thong...@uni.no mailto:patcharee.thong...@uni.no wrote: Hi, I am using

Re: fails to alter table concatenate

2015-06-30 Thread patcharee
Actually it works on MR, so the problem is from Tez. Thanks! BR, Patcharee On 30. juni 2015 10:23, Nitin Pawar wrote: can you try doing the same by changing the query engine from tez to mr1? not sure if it's a hive bug or a tez bug On Tue, Jun 30, 2015 at 1:46 PM, patcharee patcharee.thong...@uni.no

fails to alter table concatenate

2015-06-30 Thread patcharee
) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) ] DAG failed due to vertex failure. failedVertices:1 killedVertices:0 FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.DDLTask BR, Patcharee

Re: Kryo serialization of classes in additional jars

2015-06-26 Thread patcharee
Hi, I am having this problem on spark 1.4. Do you have any ideas how to solve it? I tried to use spark.executor.extraClassPath, but it did not help BR, Patcharee On 04. mai 2015 23:47, Imran Rashid wrote: Oh, this seems like a real pain. You should file a jira, I didn't see an open issue

Re: HiveContext saveAsTable create wrong partition

2015-06-16 Thread patcharee
I found if I move the partitioned columns in schemaString and in Row to the end of the sequence, then it works correctly... On 16. juni 2015 11:14, patcharee wrote: Hi, I am using spark 1.4 and HiveContext to append data into a partitioned hive table. I found that the data insert

HiveContext saveAsTable create wrong partition

2015-06-16 Thread patcharee
23 columns (longer than Tuple maximum length), so I use Row Object to store raw data, not Tuple. Here is some message from spark when it saved data 15/06/16 10:39:22 INFO metadata.Hive: Renaming src:hdfs://service-10-0.local:8020/tmp/hive-patcharee/hive_2015-06-16_10-39-21_205_8768669104487548472

sql.catalyst.ScalaReflection scala.reflect.internal.MissingRequirementError

2015-06-15 Thread patcharee
) at org.apache.spark.sql.catalyst.ScalaReflection$.schemaFor(ScalaReflection.scala:28) at org.apache.spark.sql.SQLContext.createDataFrame(SQLContext.scala:410) at org.apache.spark.sql.SQLContext$implicits$.rddToDataFrameHolder(SQLContext.scala:335) BR, Patcharee

Re: hiveContext.sql NullPointerException

2015-06-11 Thread patcharee
from hive I got nothing. How can I fix this? Any suggestions please BR, Patcharee On 07. juni 2015 16:40, Cheng Lian wrote: Spark SQL supports Hive dynamic partitioning, so one possible workaround is to create a Hive table partitioned by zone, z, year, and month dynamically, and then insert
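
A hedged sketch of the dynamic-partition workaround quoted above, run from the driver side only; it assumes an existing SparkContext sc and a DataFrame df of the computed records, and the table layout and column names are made-up placeholders:

    import org.apache.spark.sql.hive.HiveContext

    val hc = new HiveContext(sc)
    hc.setConf("hive.exec.dynamic.partition", "true")
    hc.setConf("hive.exec.dynamic.partition.mode", "nonstrict")

    df.registerTempTable("staging")
    hc.sql("""CREATE TABLE IF NOT EXISTS target (x INT, y INT, value DOUBLE)
              PARTITIONED BY (zone INT, z INT, year INT, month INT)
              STORED AS ORC""")

    // Partition columns go last in the SELECT, in the same order as PARTITION (...).
    hc.sql("""INSERT INTO TABLE target PARTITION (zone, z, year, month)
              SELECT x, y, value, zone, z, year, month FROM staging""")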

Re: hiveContext.sql NullPointerException

2015-06-08 Thread patcharee
Hi, Thanks for your guidelines. I will try it out. Btw, how do you know that HiveContext.sql (and also DataFrame.registerTempTable) is only expected to be invoked on the driver side? Where can I find the documentation? BR, Patcharee On 07. juni 2015 16:40, Cheng Lian wrote: Spark SQL supports Hive dynamic

Re: hiveContext.sql NullPointerException

2015-06-07 Thread patcharee
Hi, How can I expect to work with HiveContext on the executors? If only the driver can see HiveContext, does it mean I have to collect all (very large) datasets to the driver and use HiveContext there? That will overload the driver's memory and fail. BR, Patcharee On 07. juni 2015 11:51

write multiple outputs by key

2015-06-06 Thread patcharee
combination) gets datasets. How can I fix this problem? Any suggestions are appreciated. BR, Patcharee
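
A hedged sketch of one way to get a separate output directory per key combination, using the DataFrame writer's partitionBy (Spark 1.4+); it assumes a DataFrame df that carries the key as ordinary columns, and the key columns, format and output path are made-up placeholders:

    import org.apache.spark.sql.SaveMode

    df.write
      .mode(SaveMode.Overwrite)
      .partitionBy("zone", "z")    // one sub-directory per key combination
      .format("orc")
      .save("hdfs:///user/patcharee/output")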

hiveContext.sql NullPointerException

2015-06-06 Thread patcharee
Hi, I am trying to insert data into a partitioned hive table. The groupByKey is to combine the dataset into a partition of the hive table. After the groupByKey, I converted the Iterable[X] to a DataFrame by X.toList.toDF(). But hiveContext.sql throws a NullPointerException, see below. Any suggestions? What

Re: FetchFailed Exception

2015-06-05 Thread patcharee
Hi, I had this problem before, and in my case it was because the executor/container was killed by yarn when it used more memory than allocated. You can check if your case is the same by checking the yarn node manager log. Best, Patcharee On 05. juni 2015 07:25, ÐΞ€ρ@Ҝ (๏̯͡๏) wrote: I see

NullPointerException SQLConf.setConf

2015-06-04 Thread patcharee
) at java.lang.Thread.run(Thread.java:744) Best, Patcharee

MetaException(message:java.security.AccessControlException: Permission denied

2015-06-03 Thread patcharee
(Unknown Source) at org.apache.hadoop.hive.ql.metadata.Hive.alterPartition(Hive.java:469) ... 26 more BR, Patcharee

Re: ERROR cluster.YarnScheduler: Lost executor

2015-06-03 Thread patcharee
1.3.1, is the problem from the https://issues.apache.org/jira/browse/SPARK-4516? Best, Patcharee On 03. juni 2015 10:11, Akhil Das wrote: Which version of spark? Looks like you are hitting this one https://issues.apache.org/jira/browse/SPARK-4516 Thanks Best Regards On Wed, Jun 3, 2015 at 1

Re: ERROR cluster.YarnScheduler: Lost executor

2015-06-03 Thread patcharee
, chunkIndex=1}, buffer=FileSegmentManagedBuffer{file=/hdisk3/hadoop/yarn/local/usercache/patcharee/appcache/application_1432633634512_0213/blockmgr-12d59e6b-0895-4a0e-9d06-152d2f7ee855/09/shuffle_0_56_0.data, offset=896, length=1132499356}} to /10.10.255.238:35430; closing connection

ERROR cluster.YarnScheduler: Lost executor

2015-06-03 Thread patcharee
Hi, What can be the cause of this ERROR cluster.YarnScheduler: Lost executor? How can I fix it? Best, Patcharee

Insert overwrite to hive - ArrayIndexOutOfBoundsException

2015-06-02 Thread patcharee
) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Best, Patcharee

pig performance on reading/filtering orc file

2015-05-29 Thread patcharee
'replicated'; Best, Patcharee

cast column float

2015-05-27 Thread patcharee
records matched the condition. What can be wrong? I am using Hive 0.14 BR, Patcharee

EOFException - TezJob - Cannot submit DAG

2015-05-22 Thread patcharee
Hi, I ran a pig script on tez and got the EOFException. Check at http://wiki.apache.org/hadoop/EOFException I have no ideas at all how I can fix it. However I did not get the exception when I executed this pig script on MR. I am using HadoopVersion: 2.6.0.2.2.4.2-2, PigVersion:

saveasorcfile on partitioned orc

2015-05-20 Thread patcharee
Hi, I followed the information on https://www.mail-archive.com/reviews@spark.apache.org/msg141113.html to save orc file with spark 1.2.1. I can save data to a new orc file. I wonder how to save data to an existing and partitioned orc file? Any suggestions? BR, Patcharee

Re: conflict from apache commons codec

2015-05-20 Thread patcharee
be wrong for the latter? BR, Patcharee On 20. mai 2015 09:37, Siddharth Seth wrote: My best guess would be that an older version of commons-codec is also on the classpath for the running task. If you have access to the local-dirs configured under YARN - you could find the application dir

hive on Tez - merging orc files

2015-04-24 Thread patcharee
did not. I would appreciate any suggestions. BR, Patcharee

Re: merge small orc files

2015-04-21 Thread patcharee
: 'hdfs://service-test-1-0.testlocal:8020/apps/hive/warehouse/orc_merge5a/st=0.8/00_0' to trash at: hdfs://service-test-1-0.testlocal:8020/user/patcharee/.Trash/Current Moved: 'hdfs://service-test-1-0.testlocal:8020/apps/hive/warehouse/orc_merge5a/st=0.8/02_0' to trash at: hdfs

Re: merge small orc files

2015-04-21 Thread patcharee
which could be the cause of the problem. Please let me know how to fix it. BR, Patcharee On 21. april 2015 13:10, Gopal Vijayaraghavan wrote: alter table table concatenate do not work? I have a dynamic partitioned table (stored as orc). I tried to alter concatenate, but it did not work. See

merge small orc files

2015-04-20 Thread patcharee
29075 2015-04-20 15:23 /apps/hive/warehouse/coordinate/zone=2/part-r-2 Any ideas? BR, Patcharee

override log4j.properties

2015-04-09 Thread patcharee
Hello, How to override log4j.properties for a specific spark job? BR, Patcharee

Re: Spark Job History Server

2015-03-18 Thread patcharee
) at java.lang.Class.forName(Class.java:191) at org.apache.spark.deploy.history.HistoryServer$.main(HistoryServer.scala:183) at org.apache.spark.deploy.history.HistoryServer.main(HistoryServer.scala) Patcharee On 18. mars 2015 11:35, Akhil Das wrote: You can simply turn

Spark Job History Server

2015-03-18 Thread patcharee
spark.yarn.historyServer.address sandbox.hortonworks.com:19888 But got Exception in thread main java.lang.ClassNotFoundException: org.apache.spark.deploy.yarn.history.YarnHistoryProvider What class is really needed? How to fix it? Br, Patcharee

Re: Spark Job History Server

2015-03-18 Thread patcharee
Hi, My spark was compiled with yarn profile, I can run spark on yarn without problem. For the spark job history server problem, I checked spark-assembly-1.3.0-hadoop2.4.0.jar and found that the package org.apache.spark.deploy.yarn.history is missing. I don't know why BR, Patcharee
