Re: Can't access remote Hive table from spark

2015-02-12 Thread Zhan Zhang
When you log in, you have root access. Then you can do "su hdfs" or switch to any other
account, and from there you can create the HDFS directory and change its permissions, etc.
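
For example, a minimal sketch assuming the Ambari defaults seen in this thread (the hdfs
superuser owns /user, and the target user is xiaobogu):

su - hdfs                                      # switch to the HDFS superuser (no password needed from root)
hdfs dfs -mkdir -p /user/xiaobogu              # create the user's home directory
hdfs dfs -chown xiaobogu:hdfs /user/xiaobogu   # hand ownership to xiaobogu
hdfs dfs -chmod 755 /user/xiaobogu             # match the usual /user/<name> permissions
exit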


Thanks

Zhan Zhang

On Feb 11, 2015, at 11:28 PM, guxiaobo1982 <guxiaobo1...@qq.com> wrote:

Hi Zhan,

Yes, I found there is an hdfs account created by Ambari, but what is the password for
this account, and how can I log in under it?
Can I just change the password for the hdfs account?

Regards,



-- Original --
From:  "Zhan Zhang";mailto:zzh...@hortonworks.com>>;
Send time: Thursday, Feb 12, 2015 2:00 AM
To: ""mailto:guxiaobo1...@qq.com>>;
Cc: 
"user@spark.apache.org<mailto:user@spark.apache.org>"mailto:user@spark.apache.org>>;
 "Cheng Lian"mailto:lian.cs@gmail.com>>;
Subject:  Re: Can't access remote Hive table from spark

You need to use the right HDFS account, e.g., hdfs, to create the directory and
assign permissions.

Thanks.

Zhan Zhang
On Feb 11, 2015, at 4:34 AM, guxiaobo1982 <guxiaobo1...@qq.com> wrote:

Hi Zhan,
My single-node Hadoop cluster was installed by Ambari 1.7.0. I tried to create the
/user/xiaobogu directory in HDFS, but it failed with both user xiaobogu and root:

[xiaobogu@lix1 current]$ hadoop dfs -mkdir /user/xiaobogu
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

mkdir: Permission denied: user=xiaobogu, access=WRITE, 
inode="/user":hdfs:hdfs:drwxr-xr-x

[root@lix1 bin]# hadoop dfs -mkdir /user/xiaobogu
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.


mkdir: Permission denied: user=root, access=WRITE, 
inode="/user":hdfs:hdfs:drwxr-xr-x

I notice there is an hdfs account created by Ambari, but what is its password? Should
I use the hdfs account to create the directory?



-- Original --
From:  "Zhan Zhang";mailto:zzh...@hortonworks.com>>;
Send time: Sunday, Feb 8, 2015 4:11 AM
To: ""mailto:guxiaobo1...@qq.com>>;
Cc: 
"user@spark.apache.org<mailto:user@spark.apache.org>"mailto:user@spark.apache.org>>;
 "Cheng Lian"mailto:lian.cs@gmail.com>>;
Subject:  Re: Can't access remote Hive table from spark

Yes. You need to create xiaobogu under /user and grant the right permissions to
xiaobogu.
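
If sudo is configured on that node, an equivalent one-off way to do this from the root
shell (again just a sketch, using the hdfs superuser assumed above) is:

sudo -u hdfs hdfs dfs -mkdir -p /user/xiaobogu              # run the mkdir as the hdfs superuser
sudo -u hdfs hdfs dfs -chown xiaobogu:hdfs /user/xiaobogu   # then hand the directory to xiaobogu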

Thanks.

Zhan Zhang

On Feb 7, 2015, at 8:15 AM, guxiaobo1982 <guxiaobo1...@qq.com> wrote:

Hi Zhan Zhang,

With the pre-built version 1.2.0 of Spark running against the YARN cluster installed by
Ambari 1.7.0, I get the following errors:

[xiaobogu@lix1 spark]$ ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster --num-executors 3 --driver-memory 512m --executor-memory 512m --executor-cores 1 lib/spark-examples*.jar 10


Spark assembly has been built with Hive, including Datanucleus jars on classpath

15/02/08 00:11:53 WARN util.NativeCodeLoader: Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable

15/02/08 00:11:54 INFO client.RMProxy: Connecting to ResourceManager at 
lix1.bh.com/192.168.100.3:8050

15/02/08 00:11:56 INFO yarn.Client: Requesting a new application from cluster 
with 1 NodeManagers

15/02/08 00:11:57 INFO yarn.Client: Verifying our application has not requested 
more than the maximum memory capability of the cluster (4096 MB per container)

15/02/08 00:11:57 INFO yarn.Client: Will allocate AM container, with 896 MB 
memory including 384 MB overhead

15/02/08 00:11:57 INFO yarn.Client: Setting up container launch context for our 
AM

15/02/08 00:11:57 INFO yarn.Client: Preparing resources for our AM container

15/02/08 00:11:58 WARN hdfs.BlockReaderLocal: The short-circuit local reads 
feature cannot be used because libhadoop cannot be loaded.

Exception in thread "main" org.apache.hadoop.security.AccessControlException: 
Permission denied: user=xiaobogu, access=WRITE, 
inode="/user":hdfs:hdfs:drwxr-xr-x

at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:271)

at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:257)

at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:238)

at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:179)

at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6515)

at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6497)

at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:6449)

at 
org.apache.hadoop.hdfs.server.namenode.FSNamesyst

Re: Can't access remote Hive table from spark

2015-02-11 Thread guxiaobo1982
Hi Zhan,


Yes, I found there is an hdfs account created by Ambari, but what is the password for
this account, and how can I log in under it?
Can I just change the password for the hdfs account?


Regards,






-- Original --
From:  "Zhan Zhang";;
Send time: Thursday, Feb 12, 2015 2:00 AM
To: ""; 
Cc: "user@spark.apache.org"; "Cheng 
Lian"; 
Subject:  Re: Can't access remote Hive table from spark



You need to use the right HDFS account, e.g., hdfs, to create the directory and
assign permissions.
 
 Thanks.
 
 
 Zhan Zhang
  On Feb 11, 2015, at 4:34 AM, guxiaobo1982  wrote:
 
Hi Zhan,
My single-node Hadoop cluster was installed by Ambari 1.7.0. I tried to create the
/user/xiaobogu directory in HDFS, but it failed with both user xiaobogu and root:
 
 
[xiaobogu@lix1 current]$ hadoop dfs -mkdir /user/xiaobogu
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

mkdir: Permission denied: user=xiaobogu, access=WRITE,
inode="/user":hdfs:hdfs:drwxr-xr-x

[root@lix1 bin]# hadoop dfs -mkdir /user/xiaobogu
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

mkdir: Permission denied: user=root, access=WRITE,
inode="/user":hdfs:hdfs:drwxr-xr-x

I notice there is an hdfs account created by Ambari, but what is its password? Should
I use the hdfs account to create the directory?
  
 
 
  
 
 
 
 -- Original --
  From:  "Zhan Zhang";;
 Send time: Sunday, Feb 8, 2015 4:11 AM
 To: ""; 
 Cc: "user@spark.apache.org"; "Cheng 
Lian"; 
 Subject:  Re: Can't access remote Hive table from spark
 
 
 
Yes. You need to create xiaobogu under /user and grant the right permissions to
xiaobogu.
 
 Thanks.
 
 
 Zhan Zhang
 
  On Feb 7, 2015, at 8:15 AM, guxiaobo1982  wrote:
 
Hi Zhan Zhang,

With the pre-built version 1.2.0 of Spark running against the YARN cluster installed
by Ambari 1.7.0, I get the following errors:
  
[xiaobogu@lix1 spark]$ ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster --num-executors 3 --driver-memory 512m --executor-memory 512m --executor-cores 1 lib/spark-examples*.jar 10
 

 
 
Spark assembly has been built with Hive, including Datanucleus jars on classpath
 
15/02/08 00:11:53 WARN util.NativeCodeLoader: Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
 
15/02/08 00:11:54 INFO client.RMProxy: Connecting to ResourceManager at 
lix1.bh.com/192.168.100.3:8050
 
15/02/08 00:11:56 INFO yarn.Client: Requesting a new application from cluster 
with 1 NodeManagers
 
15/02/08 00:11:57 INFO yarn.Client: Verifying our application has not requested 
more than the maximum memory capability of the cluster (4096 MB per container)
 
15/02/08 00:11:57 INFO yarn.Client: Will allocate AM container, with 896 MB 
memory including 384 MB overhead
 
15/02/08 00:11:57 INFO yarn.Client: Setting up container launch context for our 
AM
 
15/02/08 00:11:57 INFO yarn.Client: Preparing resources for our AM container
 
15/02/08 00:11:58 WARN hdfs.BlockReaderLocal: The short-circuit local reads 
feature cannot be used because libhadoop cannot be loaded.
 
Exception in thread "main" org.apache.hadoop.security.AccessControlException: 
Permission denied: user=xiaobogu, access=WRITE, 
inode="/user":hdfs:hdfs:drwxr-xr-x
 
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:271)
 
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:257)
 
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:238)
 
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:179)
 
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6515)
 
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6497)
 
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:6449)
 
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:4251)
 
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:4221)
 
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4194)
 
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:813)
 
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:600)
 
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(Clie

Re: Can't access remote Hive table from spark

2015-02-11 Thread Zhan Zhang
You need to use the right HDFS account, e.g., hdfs, to create the directory and
assign permissions.

Thanks.

Zhan Zhang
On Feb 11, 2015, at 4:34 AM, guxiaobo1982 <guxiaobo1...@qq.com> wrote:

Hi Zhan,
My single-node Hadoop cluster was installed by Ambari 1.7.0. I tried to create the
/user/xiaobogu directory in HDFS, but it failed with both user xiaobogu and root:

[xiaobogu@lix1 current]$ hadoop dfs -mkdir /user/xiaobogu
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

mkdir: Permission denied: user=xiaobogu, access=WRITE, 
inode="/user":hdfs:hdfs:drwxr-xr-x

[root@lix1 bin]# hadoop dfs -mkdir /user/xiaobogu
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.


mkdir: Permission denied: user=root, access=WRITE, 
inode="/user":hdfs:hdfs:drwxr-xr-x

I notice there is an hdfs account created by Ambari, but what is its password? Should
I use the hdfs account to create the directory?



-- Original --
From:  "Zhan Zhang";mailto:zzh...@hortonworks.com>>;
Send time: Sunday, Feb 8, 2015 4:11 AM
To: ""mailto:guxiaobo1...@qq.com>>;
Cc: 
"user@spark.apache.org<mailto:user@spark.apache.org>"mailto:user@spark.apache.org>>;
 "Cheng Lian"mailto:lian.cs@gmail.com>>;
Subject:  Re: Can't access remote Hive table from spark

Yes. You need to create xiaobogu under /user and grant the right permissions to
xiaobogu.

Thanks.

Zhan Zhang

On Feb 7, 2015, at 8:15 AM, guxiaobo1982 <guxiaobo1...@qq.com> wrote:

Hi Zhan Zhang,

With the pre-built version 1.2.0 of Spark running against the YARN cluster installed by
Ambari 1.7.0, I get the following errors:

[xiaobogu@lix1 spark]$ ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster --num-executors 3 --driver-memory 512m --executor-memory 512m --executor-cores 1 lib/spark-examples*.jar 10


Spark assembly has been built with Hive, including Datanucleus jars on classpath

15/02/08 00:11:53 WARN util.NativeCodeLoader: Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable

15/02/08 00:11:54 INFO client.RMProxy: Connecting to ResourceManager at 
lix1.bh.com/192.168.100.3:8050

15/02/08 00:11:56 INFO yarn.Client: Requesting a new application from cluster 
with 1 NodeManagers

15/02/08 00:11:57 INFO yarn.Client: Verifying our application has not requested 
more than the maximum memory capability of the cluster (4096 MB per container)

15/02/08 00:11:57 INFO yarn.Client: Will allocate AM container, with 896 MB 
memory including 384 MB overhead

15/02/08 00:11:57 INFO yarn.Client: Setting up container launch context for our 
AM

15/02/08 00:11:57 INFO yarn.Client: Preparing resources for our AM container

15/02/08 00:11:58 WARN hdfs.BlockReaderLocal: The short-circuit local reads 
feature cannot be used because libhadoop cannot be loaded.

Exception in thread "main" org.apache.hadoop.security.AccessControlException: 
Permission denied: user=xiaobogu, access=WRITE, 
inode="/user":hdfs:hdfs:drwxr-xr-x

at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:271)

at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:257)

at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:238)

at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:179)

at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6515)

at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6497)

at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:6449)

at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:4251)

at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:4221)

at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4194)

at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:813)

at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:600)

at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)

at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)

at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)

at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)

at org.apache.hadoop.ipc.Server$Handler$1.run(S

Re: Can't access remote Hive table from spark

2015-02-11 Thread guxiaobo1982
Hi Zhan,
My single-node Hadoop cluster was installed by Ambari 1.7.0. I tried to create the
/user/xiaobogu directory in HDFS, but it failed with both user xiaobogu and root:

[xiaobogu@lix1 current]$ hadoop dfs -mkdir /user/xiaobogu
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

mkdir: Permission denied: user=xiaobogu, access=WRITE,
inode="/user":hdfs:hdfs:drwxr-xr-x

[root@lix1 bin]# hadoop dfs -mkdir /user/xiaobogu
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

mkdir: Permission denied: user=root, access=WRITE,
inode="/user":hdfs:hdfs:drwxr-xr-x

I notice there is an hdfs account created by Ambari, but what is its password? Should
I use the hdfs account to create the directory?








-- Original --
From:  "Zhan Zhang";;
Send time: Sunday, Feb 8, 2015 4:11 AM
To: ""; 
Cc: "user@spark.apache.org"; "Cheng 
Lian"; 
Subject:  Re: Can't access remote Hive table from spark



Yes. You need to create xiaobogu under /user and grant the right permissions to
xiaobogu.
 
 Thanks.
 
 
 Zhan Zhang
 
  On Feb 7, 2015, at 8:15 AM, guxiaobo1982  wrote:
 
Hi Zhan Zhang,

With the pre-built version 1.2.0 of Spark running against the YARN cluster installed
by Ambari 1.7.0, I get the following errors:
  
[xiaobogu@lix1 spark]$ ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster --num-executors 3 --driver-memory 512m --executor-memory 512m --executor-cores 1 lib/spark-examples*.jar 10
 

 
 
Spark assembly has been built with Hive, including Datanucleus jars on classpath
 
15/02/08 00:11:53 WARN util.NativeCodeLoader: Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
 
15/02/08 00:11:54 INFO client.RMProxy: Connecting to ResourceManager at 
lix1.bh.com/192.168.100.3:8050
 
15/02/08 00:11:56 INFO yarn.Client: Requesting a new application from cluster 
with 1 NodeManagers
 
15/02/08 00:11:57 INFO yarn.Client: Verifying our application has not requested 
more than the maximum memory capability of the cluster (4096 MB per container)
 
15/02/08 00:11:57 INFO yarn.Client: Will allocate AM container, with 896 MB 
memory including 384 MB overhead
 
15/02/08 00:11:57 INFO yarn.Client: Setting up container launch context for our 
AM
 
15/02/08 00:11:57 INFO yarn.Client: Preparing resources for our AM container
 
15/02/08 00:11:58 WARN hdfs.BlockReaderLocal: The short-circuit local reads 
feature cannot be used because libhadoop cannot be loaded.
 
Exception in thread "main" org.apache.hadoop.security.AccessControlException: 
Permission denied: user=xiaobogu, access=WRITE, 
inode="/user":hdfs:hdfs:drwxr-xr-x
 
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:271)
 
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:257)
 
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:238)
 
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:179)
 
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6515)
 
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6497)
 
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:6449)
 
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:4251)
 
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:4221)
 
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4194)
 
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:813)
 
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:600)
 
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
 
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
 
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
 
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
 
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
 
at java.security.AccessController.doPrivileged(Native Method)
 
at javax.security.auth.Subject.doAs(Subject.java:415)
 
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
 
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
 

 
 
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
 
at 
sun.ref

Re: Can't access remote Hive table from spark

2015-02-08 Thread guxiaobo1982
Hi Lian,
Will the latest 0.14.0 version of Hive, which is installed by Ambari 1.7.0 by default,
be supported by the next release of Spark?


Regards,




-- Original --
From:  "Cheng Lian";;
Send time: Friday, Feb 6, 2015 9:02 AM
To: ""; "user@spark.apache.org"; 

Subject:  Re: Can't access remote Hive table from spark



  
Please note that Spark 1.2.0 only supports Hive 0.13.1 or 0.12.0; no other versions
are supported.

Best,
Cheng
   
On 1/25/15 12:18 AM, guxiaobo1982 wrote:
   



Hi,
I built and started a single node standalone Spark 1.2.0 cluster along with a single
node Hive 0.14.0 instance installed by Ambari 1.17.0. On the Spark and Hive node I can
create and query tables inside Hive, and on remote machines I can submit the SparkPi
example to the Spark master. But I failed to run the following example code:
   
 

public class SparkTest {

    public static void main(String[] args)
    {
        String appName = "This is a test application";
        String master = "spark://lix1.bh.com:7077";

        SparkConf conf = new SparkConf().setAppName(appName).setMaster(master);
        JavaSparkContext sc = new JavaSparkContext(conf);

        JavaHiveContext sqlCtx = new org.apache.spark.sql.hive.api.java.JavaHiveContext(sc);
        //sqlCtx.sql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING)");
        //sqlCtx.sql("LOAD DATA LOCAL INPATH '/opt/spark/examples/src/main/resources/kv1.txt' INTO TABLE src");

        // Queries are expressed in HiveQL.
        List rows = sqlCtx.sql("FROM src SELECT key, value").collect();
        System.out.print("I got " + rows.size() + " rows \r\n");

        sc.close();
    }
}
 

 
 
Exception in thread "main" 
org.apache.hadoop.hive.ql.metadata.InvalidTableException: Table not found src
 
 at   
org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:980)
 
 at   
org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:950)
 
 at   
org.apache.spark.sql.hive.HiveMetastoreCatalog.lookupRelation(HiveMetastoreCatalog.scala:70)
 
 at 
org.apache.spark.sql.hive.HiveContext$anon$2.org$apache$spark$sql$catalyst$analysis$OverrideCatalog$super$lookupRelation(HiveContext.scala:253)
 
 at 
org.apache.spark.sql.catalyst.analysis.OverrideCatalog$anonfun$lookupRelation$3.apply(Catalog.scala:141)
 
 at 
org.apache.spark.sql.catalyst.analysis.OverrideCatalog$anonfun$lookupRelation$3.apply(Catalog.scala:141)
 
 at   scala.Option.getOrElse(Option.scala:120)
 
 at 
org.apache.spark.sql.catalyst.analysis.OverrideCatalog$class.lookupRelation(Catalog.scala:141)
 
 at   
org.apache.spark.sql.hive.HiveContext$anon$2.lookupRelation(HiveContext.scala:253)
 
 at 
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$anonfun$apply$5.applyOrElse(Analyzer.scala:143)
 
 at 
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$anonfun$apply$5.applyOrElse(Analyzer.scala:138)
 
 at   
org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:144)
 
 at   
org.apache.spark.sql.catalyst.trees.TreeNode$anonfun$4.apply(TreeNode.scala:162)
 
 at   scala.collection.Iterator$anon$11.next(Iterator.scala:328)
 
 at   scala.collection.Iterator$class.foreach(Iterator.scala:727)
 
 at   scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
 
 at   
scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
 
 at   
scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
 
 at   
scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
 
 at   
scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
 
 at   scala.collection.AbstractIterator.to(Iterator.scala:1157)
 
 at   
scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.

Re: Can't access remote Hive table from spark

2015-02-07 Thread Zhan Zhang
.mkdirs(FileSystem.java:595)

at 
org.apache.spark.deploy.yarn.ClientBase$class.prepareLocalResources(ClientBase.scala:151)

at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:35)

at 
org.apache.spark.deploy.yarn.ClientBase$class.createContainerLaunchContext(ClientBase.scala:308)

at 
org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:35)

at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:80)

at org.apache.spark.deploy.yarn.ClientBase$class.run(ClientBase.scala:501)

at org.apache.spark.deploy.yarn.Client.run(Client.scala:35)

at org.apache.spark.deploy.yarn.Client$.main(Client.scala:139)

at org.apache.spark.deploy.yarn.Client.main(Client.scala)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:606)

at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:358)

at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)

at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Caused by: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException):
 Permission denied: user=xiaobogu, access=WRITE, 
inode="/user":hdfs:hdfs:drwxr-xr-x

at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:271)

at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:257)

at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:238)

at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:179)

at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6515)

at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6497)

at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:6449)

at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:4251)

at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:4221)

at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4194)

at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:813)

at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:600)

at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)

at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)

at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)

at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)

at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)

at java.security.AccessController.doPrivileged(Native Method)

at javax.security.auth.Subject.doAs(Subject.java:415)

at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)

at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)


at org.apache.hadoop.ipc.Client.call(Client.java:1410)

at org.apache.hadoop.ipc.Client.call(Client.java:1363)

at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)

at com.sun.proxy.$Proxy17.mkdirs(Unknown Source)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:606)

at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190)

at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)

at com.sun.proxy.$Proxy17.mkdirs(Unknown Source)

at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:500)

at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2553)

... 24 more

[xiaobogu@lix1 spark]$



-- Original --
From:  "Zhan Zhang";mailto:zzh...@hortonworks.com>>;
Send time: Friday, Feb 6, 2015 2:55 PM
To: ""mailto:guxiaobo1...@qq.com>>;
Cc: 
"user@spark.apache.org<mailto:user@spark.apache.org>"mailto:user@spark.apache.org>>;
 "Cheng Lian"mailto:lian.cs@gmail.com>>;
Subject:  Re: Can't access remote Hive table from spark

I'm not sure about Spark standalone mode, but with Spark on YARN it should work. You can
check the following link:

 http://hortonworks.com/hadoop-tutorial/using-apache-spark-hdp/

Thanks.

Zhan Zha

Re: Can't access remote Hive table from spark

2015-02-07 Thread Ted Yu
gMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>
> at java.lang.reflect.Method.invoke(Method.java:606)
>
> at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190)
>
> at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
>
> at com.sun.proxy.$Proxy17.mkdirs(Unknown Source)
>
> at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:500)
>
> at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2553)
>
> ... 24 more
>
> [xiaobogu@lix1 spark]$
>
>
>
>
> -- Original --
> *From: * "Zhan Zhang";;
> *Send time:* Friday, Feb 6, 2015 2:55 PM
> *To:* "";
> *Cc:* "user@spark.apache.org"; "Cheng Lian"<
> lian.cs@gmail.com>;
> *Subject: * Re: Can't access remote Hive table from spark
>
> Not sure spark standalone mode. But on spark-on-yarn, it should work. You
> can check following link:
>
>   http://hortonworks.com/hadoop-tutorial/using-apache-spark-hdp/
>
>  Thanks.
>
>  Zhan Zhang
>
>  On Feb 5, 2015, at 5:02 PM, Cheng Lian  wrote:
>
>   Please note that Spark 1.2.0 *only* support Hive 0.13.1 *or* 0.12.0,
> none of other versions are supported.
>
> Best,
> Cheng
>
> On 1/25/15 12:18 AM, guxiaobo1982 wrote:
>
>
>  Hi,
> I built and started a single node standalone Spark 1.2.0 cluster along
> with a single node Hive 0.14.0 instance installed by Ambari 1.17.0. On the
> Spark and Hive node I can create and query tables inside Hive, and on
> remote machines I can submit the SparkPi example to the Spark master. But
> I failed to run the following example code :
>
>  public class SparkTest {
>
> public static void main(String[] args)
>
> {
>
> String appName= "This is a test application";
>
> String master="spark://lix1.bh.com:7077";
>
>  SparkConf conf = new SparkConf().setAppName(appName).setMaster(master);
>
> JavaSparkContext sc = new JavaSparkContext(conf);
>
>  JavaHiveContext sqlCtx = new
> org.apache.spark.sql.hive.api.java.JavaHiveContext(sc);
>
> //sqlCtx.sql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING)");
>
> //sqlCtx.sql("LOAD DATA LOCAL INPATH '/opt/spark/examples/src
> /main/resources/kv1.txt' INTO TABLE src");
>
> // Queries are expressed in HiveQL.
>
> List rows = sqlCtx.sql("FROM src SELECT key, value").collect();
>
> System.out.print("I got " + rows.size() + " rows \r\n");
>
> sc.close();}
>
> }
>
>
>  Exception in thread "main"
> org.apache.hadoop.hive.ql.metadata.InvalidTableException: Table not found
> src
>
> at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:980)
>
> at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:950)
>
> at org.apache.spark.sql.hive.HiveMetastoreCatalog.lookupRelation(
> HiveMetastoreCatalog.scala:70)
>
> at org.apache.spark.sql.hive.HiveContext$anon$2.org
> $apache$spark$sql$catalyst$analysis$OverrideCatalog$super$lookupRelation(
> HiveContext.scala:253)
>
> at
> org.apache.spark.sql.catalyst.analysis.OverrideCatalog$anonfun$lookupRelation$3.apply(
> Catalog.scala:141)
>
> at
> org.apache.spark.sql.catalyst.analysis.OverrideCatalog$anonfun$lookupRelation$3.apply(
> Catalog.scala:141)
>
> at scala.Option.getOrElse(Option.scala:120)
>
> at
> org.apache.spark.sql.catalyst.analysis.OverrideCatalog$class.lookupRelation(
> Catalog.scala:141)
>
> at org.apache.spark.sql.hive.HiveContext$anon$2.lookupRelation(
> HiveContext.scala:253)
>
> at
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$anonfun$apply$5.applyOrElse(
> Analyzer.scala:143)
>
> at
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$anonfun$apply$5.applyOrElse(
> Analyzer.scala:138)
>
> at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(
> TreeNode.scala:144)
>
> at org.apache.spark.sql.catalyst.trees.TreeNode$anonfun$4.apply(
> TreeNode.scala:162)
>
> at scala.collection.Iterator$anon$11.next(Iterator.scala:328)
>
> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>
> at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48
> )
>
> at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(
> ArrayBuffer.scala:103)
>
> at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47
> )
>
> at scala.collection.TraversableOnce$class.to(TraversableOn

Re: Can't access remote Hive table from spark

2015-02-07 Thread guxiaobo1982
From: "Zhan Zhang" <zzh...@hortonworks.com>
Send time: Friday, Feb 6, 2015 2:55 PM
To: <guxiaobo1...@qq.com>
Cc: "user@spark.apache.org" <user@spark.apache.org>; "Cheng Lian" <lian.cs@gmail.com>
Subject:  Re: Can't access remote Hive table from spark



I'm not sure about Spark standalone mode, but with Spark on YARN it should work. You can
check the following link:
 
  http://hortonworks.com/hadoop-tutorial/using-apache-spark-hdp/
 
 
 Thanks.
 
 
 Zhan Zhang
 
  On Feb 5, 2015, at 5:02 PM, Cheng Lian  wrote:
 

Please note that Spark 1.2.0 only supports Hive 0.13.1 or 0.12.0; no other versions
are supported.
 
Best,
 Cheng
 
On 1/25/15 12:18 AM, guxiaobo1982 wrote:
 
 
  
 
  Hi,
 I built and started a single node standalone Spark 1.2.0 cluster along with a 
single node Hive 0.14.0 instance installed by Ambari 1.17.0. On the Spark and 
Hive node I can create and query tables inside Hive, and on remote machines I 
can submit the SparkPi example  to the Spark master. But I failed to run the 
following example code :
 
 
  
public class SparkTest {
 
public  static void main(String[] args)
 
{
 
String appName= "This is a test application";
 
String master="spark://lix1.bh.com:7077";
 
 
 
SparkConf conf = new SparkConf().setAppName(appName).setMaster(master);
 
JavaSparkContext sc = new JavaSparkContext(conf);
 
 
 
JavaHiveContext sqlCtx = new 
org.apache.spark.sql.hive.api.java.JavaHiveContext(sc);
 
//sqlCtx.sql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING)");
 
//sqlCtx.sql("LOAD DATA LOCAL INPATH 
'/opt/spark/examples/src/main/resources/kv1.txt' INTO TABLE src");
 
// Queries are expressed in HiveQL.
 
List rows = sqlCtx.sql("FROM src SELECT key, value").collect();
 
System.out.print("I got " + rows.size() + " rows \r\n");
 
sc.close();}
 
}
 

 
 
Exception in thread "main" 
org.apache.hadoop.hive.ql.metadata.InvalidTableException: Table not found src
 
at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:980)
 
at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:950)
 
at 
org.apache.spark.sql.hive.HiveMetastoreCatalog.lookupRelation(HiveMetastoreCatalog.scala:70)
 
at 
org.apache.spark.sql.hive.HiveContext$anon$2.org$apache$spark$sql$catalyst$analysis$OverrideCatalog$super$lookupRelation(HiveContext.scala:253)
 
at 
org.apache.spark.sql.catalyst.analysis.OverrideCatalog$anonfun$lookupRelation$3.apply(Catalog.scala:141)
 
at 
org.apache.spark.sql.catalyst.analysis.OverrideCatalog$anonfun$lookupRelation$3.apply(Catalog.scala:141)
 
at scala.Option.getOrElse(Option.scala:120)
 
at 
org.apache.spark.sql.catalyst.analysis.OverrideCatalog$class.lookupRelation(Catalog.scala:141)
 
at 
org.apache.spark.sql.hive.HiveContext$anon$2.lookupRelation(HiveContext.scala:253)
 
at 
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$anonfun$apply$5.applyOrElse(Analyzer.scala:143)
 
at 
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$anonfun$apply$5.applyOrElse(Analyzer.scala:138)
 
at 
org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:144)
 
at 
org.apache.spark.sql.catalyst.trees.TreeNode$anonfun$4.apply(TreeNode.scala:162)
 
at scala.collection.Iterator$anon$11.next(Iterator.scala:328)
 
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
 
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
 
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
 
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
 
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
 
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
 
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
 
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
 
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
 
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
 
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
 
at 
org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:191)
 
at 
org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:147)
 
at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:135)
 
at 
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:138)
 
at 
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:137)
 
at 
org.apache.spark.sql.catalyst.rules.RuleExecutor$anonfun$apply$1$anonfun$apply$2.apply(RuleExecutor.scala:61)
 
at 
org.apache.spark.sql.catalyst.rules.RuleExecutor$anonfun$apply$1$anonfun$apply$2.apply(RuleExecutor.scala:59)
 
at 
scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:111)
 
at scala.collection.immutable.List.foldLeft(List.scala:84)
 
at 
org.apache.spark.sql.catalyst.rules.RuleExecutor$anonfun$apply$1.apply(RuleExecutor.scala:59)

Re: Can't access remote Hive table from spark

2015-02-05 Thread Zhan Zhang
I'm not sure about Spark standalone mode, but with Spark on YARN it should work. You can
check the following link:

 http://hortonworks.com/hadoop-tutorial/using-apache-spark-hdp/

Thanks.

Zhan Zhang

On Feb 5, 2015, at 5:02 PM, Cheng Lian <lian.cs@gmail.com> wrote:


Please note that Spark 1.2.0 only supports Hive 0.13.1 or 0.12.0; no other versions
are supported.

Best,
Cheng

On 1/25/15 12:18 AM, guxiaobo1982 wrote:


Hi,
I built and started a single node standalone Spark 1.2.0 cluster along with a 
single node Hive 0.14.0 instance installed by Ambari 1.17.0. On the Spark and 
Hive node I can create and query tables inside Hive, and on remote machines I 
can submit the SparkPi example to the Spark master. But I failed to run the 
following example code :


public class SparkTest {

public static void main(String[] args)

{

String appName= "This is a test application";

String master="spark://lix1.bh.com:7077";


SparkConf conf = new SparkConf().setAppName(appName).setMaster(master);

JavaSparkContext sc = new JavaSparkContext(conf);


JavaHiveContext sqlCtx = new 
org.apache.spark.sql.hive.api.java.JavaHiveContext(sc);

//sqlCtx.sql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING)");

//sqlCtx.sql("LOAD DATA LOCAL INPATH 
'/opt/spark/examples/src/main/resources/kv1.txt' INTO TABLE src");

// Queries are expressed in HiveQL.

List rows = sqlCtx.sql("FROM src SELECT key, value").collect();

System.out.print("I got " + rows.size() + " rows \r\n");

sc.close();}

}


Exception in thread "main" 
org.apache.hadoop.hive.ql.metadata.InvalidTableException: Table not found src

at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:980)

at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:950)

at 
org.apache.spark.sql.hive.HiveMetastoreCatalog.lookupRelation(HiveMetastoreCatalog.scala:70)

at 
org.apache.spark.sql.hive.HiveContext$anon$2.org$apache$spark$sql$catalyst$analysis$OverrideCatalog$super$lookupRelation(HiveContext.scala:253)

at 
org.apache.spark.sql.catalyst.analysis.OverrideCatalog$anonfun$lookupRelation$3.apply(Catalog.scala:141)

at 
org.apache.spark.sql.catalyst.analysis.OverrideCatalog$anonfun$lookupRelation$3.apply(Catalog.scala:141)

at scala.Option.getOrElse(Option.scala:120)

at 
org.apache.spark.sql.catalyst.analysis.OverrideCatalog$class.lookupRelation(Catalog.scala:141)

at 
org.apache.spark.sql.hive.HiveContext$anon$2.lookupRelation(HiveContext.scala:253)

at 
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$anonfun$apply$5.applyOrElse(Analyzer.scala:143)

at 
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$anonfun$apply$5.applyOrElse(Analyzer.scala:138)

at 
org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:144)

at 
org.apache.spark.sql.catalyst.trees.TreeNode$anonfun$4.apply(TreeNode.scala:162)

at scala.collection.Iterator$anon$11.next(Iterator.scala:328)

at scala.collection.Iterator$class.foreach(Iterator.scala:727)

at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)

at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)

at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)

at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)

at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)

at scala.collection.AbstractIterator.to(Iterator.scala:1157)

at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)

at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)

at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)

at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)

at 
org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:191)

at 
org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:147)

at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:135)

at 
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:138)

at 
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:137)

at 
org.apache.spark.sql.catalyst.rules.RuleExecutor$anonfun$apply$1$anonfun$apply$2.apply(RuleExecutor.scala:61)

at 
org.apache.spark.sql.catalyst.rules.RuleExecutor$anonfun$apply$1$anonfun$apply$2.apply(RuleExecutor.scala:59)

at 
scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:111)

at scala.collection.immutable.List.foldLeft(List.scala:84)

at 
org.apache.spark.sql.catalyst.rules.RuleExecutor$anonfun$apply$1.apply(RuleExecutor.scala:59)

at 
org.apache.spark.sql.catalyst.rules.RuleExecutor$anonfun$apply$1.apply(RuleExecutor.scala:51)

at scala.collection.immutable.List.foreach(List.scala:318)

at org.apache.spark.sql.catalyst.rules.RuleExecutor.apply(RuleExecutor.scala:51)

at 
org.apache.spark.sql.SQLContext$QueryExecution.analyzed$lzycompute(SQLContext.scala:411)

at org.apache.spark.s

Re: RE: Can't access remote Hive table from spark

2015-02-05 Thread Skanda
Hi,

My spark-env.sh has the following entries with respect to classpath:

export SPARK_CLASSPATH=$SPARK_CLASSPATH:/usr/lib/hive/lib/*:/etc/hive/conf/

-Skanda

On Sun, Feb 1, 2015 at 11:45 AM, guxiaobo1982  wrote:

> Hi Skanda,
>
> How do set up your SPARK_CLASSPATH?
>
> I add the following line to my SPARK_HOME/conf/spark-env.sh , and still
> got the same error.
>
> export SPARK_CLASSPATH=${SPARK_CLASSPATH}:/etc/hive/conf
>
>
> -- Original --
> *From: * "Skanda Prasad";;
> *Send time:* Monday, Jan 26, 2015 7:41 AM
> *To:* ""; "user@spark.apache.org"<
> user@spark.apache.org>;
> *Subject: * RE: Can't access remote Hive table from spark
>
> This happened to me as well, putting hive-site.xml inside conf doesn't
> seem to work. Instead I added /etc/hive/conf to SPARK_CLASSPATH and it
> worked. You can try this approach.
>
> -Skanda
> --
> From: guxiaobo1982 
> Sent: 25-01-2015 13:50
> To: user@spark.apache.org
> Subject: Can't access remote Hive table from spark
>
> Hi,
> I built and started a single node standalone Spark 1.2.0 cluster along
> with a single node Hive 0.14.0 instance installed by Ambari 1.17.0. On the
> Spark and Hive node I can create and query tables inside Hive, and on
> remote machines I can submit the SparkPi example to the Spark master. But
> I failed to run the following example code :
>
> public class SparkTest {
>
> public static void main(String[] args)
>
> {
>
>  String appName= "This is a test application";
>
>  String master="spark://lix1.bh.com:7077";
>
>   SparkConf conf = new SparkConf().setAppName(appName).setMaster(master);
>
>  JavaSparkContext sc = new JavaSparkContext(conf);
>
>   JavaHiveContext sqlCtx = new
> org.apache.spark.sql.hive.api.java.JavaHiveContext(sc);
>
>  //sqlCtx.sql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING)");
>
>  //sqlCtx.sql("LOAD DATA LOCAL INPATH '/opt/spark/examples/src
> /main/resources/kv1.txt' INTO TABLE src");
>
>  // Queries are expressed in HiveQL.
>
> List rows = sqlCtx.sql("FROM src SELECT key, value").collect();
>
> System.out.print("I got " + rows.size() + " rows \r\n");
>
>  sc.close();}
>
> }
>
>
> Exception in thread "main"
> org.apache.hadoop.hive.ql.metadata.InvalidTableException: Table not found
> src
>
> at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:980)
>
> at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:950)
>
> at org.apache.spark.sql.hive.HiveMetastoreCatalog.lookupRelation(
> HiveMetastoreCatalog.scala:70)
>
> at org.apache.spark.sql.hive.HiveContext$$anon$2.org
> $apache$spark$sql$catalyst$analysis$OverrideCatalog$$super$lookupRelation(
> HiveContext.scala:253)
>
> at
> org.apache.spark.sql.catalyst.analysis.OverrideCatalog$$anonfun$lookupRelation$3.apply(
> Catalog.scala:141)
>
> at
> org.apache.spark.sql.catalyst.analysis.OverrideCatalog$$anonfun$lookupRelation$3.apply(
> Catalog.scala:141)
>
> at scala.Option.getOrElse(Option.scala:120)
>
> at
> org.apache.spark.sql.catalyst.analysis.OverrideCatalog$class.lookupRelation(
> Catalog.scala:141)
>
> at org.apache.spark.sql.hive.HiveContext$$anon$2.lookupRelation(
> HiveContext.scala:253)
>
> at
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$5.applyOrElse(
> Analyzer.scala:143)
>
> at
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$5.applyOrElse(
> Analyzer.scala:138)
>
> at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(
> TreeNode.scala:144)
>
> at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(
> TreeNode.scala:162)
>
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>
> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>
> at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48
> )
>
> at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(
> ArrayBuffer.scala:103)
>
> at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47
> )
>
> at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
>
> at scala.collection.AbstractIterator.to(Iterator.scala:1157)
>
> at scala.collection.TraversableOnce$class.toBuffer(
> TraversableOnce.scala:265)
>
> at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
>
> at scala.collection.TraversableOnce$class.toArray(
> Traversab

Re: Can't access remote Hive table from spark

2015-02-05 Thread Cheng Lian
Please note that Spark 1.2.0 only supports Hive 0.13.1 or 0.12.0; no other versions
are supported.


Best,
Cheng

On 1/25/15 12:18 AM, guxiaobo1982 wrote:


Hi,
I built and started a single node standalone Spark 1.2.0 cluster along 
with a single node Hive 0.14.0 instance installed by Ambari 1.17.0. On 
the Spark and Hive node I can create and query tables inside Hive, and 
on remote machines I can submit the SparkPi example to the Spark 
master. But I failed to run the following example code :


public class SparkTest {

public static void main(String[] args)

{

String appName= "This is a test application";

String master="spark://lix1.bh.com:7077";

SparkConf conf = new SparkConf().setAppName(appName).setMaster(master);

JavaSparkContext sc = new JavaSparkContext(conf);

JavaHiveContext sqlCtx = new 
org.apache.spark.sql.hive.api.java.JavaHiveContext(sc);


//sqlCtx.sql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING)");

//sqlCtx.sql("LOAD DATA LOCAL INPATH 
'/opt/spark/examples/src/main/resources/kv1.txt' INTO TABLE src");


// Queries are expressed in HiveQL.

List rows = sqlCtx.sql("FROM src SELECT key, value").collect();

System.out.print("I got "+ rows.size() + " rows \r\n");

sc.close();}

}


Exception in thread "main" 
org.apache.hadoop.hive.ql.metadata.InvalidTableException: Table not 
found src


at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:980)

at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:950)

at 
org.apache.spark.sql.hive.HiveMetastoreCatalog.lookupRelation(HiveMetastoreCatalog.scala:70)


at 
org.apache.spark.sql.hive.HiveContext$anon$2.org$apache$spark$sql$catalyst$analysis$OverrideCatalog$super$lookupRelation(HiveContext.scala:253)


at 
org.apache.spark.sql.catalyst.analysis.OverrideCatalog$anonfun$lookupRelation$3.apply(Catalog.scala:141)


at 
org.apache.spark.sql.catalyst.analysis.OverrideCatalog$anonfun$lookupRelation$3.apply(Catalog.scala:141)


at scala.Option.getOrElse(Option.scala:120)

at 
org.apache.spark.sql.catalyst.analysis.OverrideCatalog$class.lookupRelation(Catalog.scala:141)


at 
org.apache.spark.sql.hive.HiveContext$anon$2.lookupRelation(HiveContext.scala:253)


at 
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$anonfun$apply$5.applyOrElse(Analyzer.scala:143)


at 
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$anonfun$apply$5.applyOrElse(Analyzer.scala:138)


at 
org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:144)


at 
org.apache.spark.sql.catalyst.trees.TreeNode$anonfun$4.apply(TreeNode.scala:162)


at scala.collection.Iterator$anon$11.next(Iterator.scala:328)

at scala.collection.Iterator$class.foreach(Iterator.scala:727)

at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)

at 
scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)


at 
scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)


at 
scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)


at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)

at scala.collection.AbstractIterator.to(Iterator.scala:1157)

at 
scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)


at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)

at 
scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)


at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)

at 
org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:191)


at 
org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:147)


at 
org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:135)


at 
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:138)


at 
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:137)


at 
org.apache.spark.sql.catalyst.rules.RuleExecutor$anonfun$apply$1$anonfun$apply$2.apply(RuleExecutor.scala:61)


at 
org.apache.spark.sql.catalyst.rules.RuleExecutor$anonfun$apply$1$anonfun$apply$2.apply(RuleExecutor.scala:59)


at 
scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:111)


at scala.collection.immutable.List.foldLeft(List.scala:84)

at 
org.apache.spark.sql.catalyst.rules.RuleExecutor$anonfun$apply$1.apply(RuleExecutor.scala:59)


at 
org.apache.spark.sql.catalyst.rules.RuleExecutor$anonfun$apply$1.apply(RuleExecutor.scala:51)


at scala.collection.immutable.List.foreach(List.scala:318)

at 
org.apache.spark.sql.catalyst.rules.RuleExecutor.apply(RuleExecutor.scala:51)


at 
org.apache.spark.sql.SQLContext$QueryExecution.analyzed$lzycompute(SQLContext.scala:411)


at 
org.apache.spark.sql.SQLContext$QueryExecution.analyzed(SQLContext.scala:411)


at 
org.apache.spark.sql.SQLContext$QueryExecution.withCachedData$lzycompute(SQLContext.scala:412)


at 
org.apache.spark.sql.SQLContext$QueryExecution.withCachedData(SQLContext.s

Re: Can't access remote Hive table from spark

2015-02-01 Thread guxiaobo1982
A friend told me that I should add the hive-site.xml file to the --files option of the
spark-submit command, but how can I run and debug my program inside Eclipse?
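
For the spark-submit path, that suggestion would look roughly like the following sketch
(the application jar name and main class here are placeholders for your own build; the
--files flag is the relevant part):

./bin/spark-submit \
  --class SparkTest \
  --master spark://lix1.bh.com:7077 \
  --files /etc/hive/conf/hive-site.xml \
  target/spark-test.jar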






-- Original --
From:  "guxiaobo1982";;
Send time: Sunday, Feb 1, 2015 4:18 PM
To: "Jörn Franke"; 

Subject:  Re: Can't access remote Hive table from spark



I am sorry, I forgot to say that I have created the table manually.


On Feb 1, 2015, at 4:14 PM, Jörn Franke wrote:



You commented out the line which is supposed to create a table.
On Jan 25, 2015 at 09:20, "guxiaobo1982" wrote:
Hi,
I built and started a single node standalone Spark 1.2.0 cluster along with a 
single node Hive 0.14.0 instance installed by Ambari 1.17.0. On the Spark and 
Hive node I can create and query tables inside Hive, and on remote machines I 
can submit the SparkPi example to the Spark master. But I failed to run the 
following example code :


 
public class SparkTest {
 
public static void main(String[] args)
 
{
 
String appName= "This is a test application";
 
String master="spark://lix1.bh.com:7077";
 

 
SparkConf conf = new 
SparkConf().setAppName(appName).setMaster(master);
 
JavaSparkContext sc = new JavaSparkContext(conf);
 

 
JavaHiveContext sqlCtx = new 
org.apache.spark.sql.hive.api.java.JavaHiveContext(sc);
 
//sqlCtx.sql("CREATE TABLE IF NOT EXISTS src (key INT, value 
STRING)");
 
//sqlCtx.sql("LOAD DATA LOCAL INPATH 
'/opt/spark/examples/src/main/resources/kv1.txt' INTO TABLE src");
 
// Queries are expressed in HiveQL.
 
List rows = sqlCtx.sql("FROM src SELECT key, value").collect();
 
System.out.print("I got " + rows.size() + " rows \r\n");
 
sc.close();}
 
}




Exception in thread "main" 
org.apache.hadoop.hive.ql.metadata.InvalidTableException: Table not found src

at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:980)

at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:950)

at 
org.apache.spark.sql.hive.HiveMetastoreCatalog.lookupRelation(HiveMetastoreCatalog.scala:70)

at 
org.apache.spark.sql.hive.HiveContext$$anon$2.org$apache$spark$sql$catalyst$analysis$OverrideCatalog$$super$lookupRelation(HiveContext.scala:253)

at 
org.apache.spark.sql.catalyst.analysis.OverrideCatalog$$anonfun$lookupRelation$3.apply(Catalog.scala:141)

at 
org.apache.spark.sql.catalyst.analysis.OverrideCatalog$$anonfun$lookupRelation$3.apply(Catalog.scala:141)

at scala.Option.getOrElse(Option.scala:120)

at 
org.apache.spark.sql.catalyst.analysis.OverrideCatalog$class.lookupRelation(Catalog.scala:141)

at 
org.apache.spark.sql.hive.HiveContext$$anon$2.lookupRelation(HiveContext.scala:253)

at 
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$5.applyOrElse(Analyzer.scala:143)

at 
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$5.applyOrElse(Analyzer.scala:138)

at 
org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:144)

at 
org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:162)

at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)

at scala.collection.Iterator$class.foreach(Iterator.scala:727)

at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)

at 
scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)

at 
scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)

at 
scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)

at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)

at scala.collection.AbstractIterator.to(Iterator.scala:1157)

at 
scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)

at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)

at 
scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)

at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)

at 
org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:191)

at 
org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:147)

at 
org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:135)

at 
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:138)

at 
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:137)

at 
org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2

Re: RE: Can't access remote Hive table from spark

2015-01-31 Thread guxiaobo1982
The following line does not work either:
export SPARK_CLASSPATH=/etc/hive/conf




-- Original --
From:  "guxiaobo1982";;
Send time: Sunday, Feb 1, 2015 2:15 PM
To: "Skanda Prasad"; 
"user@spark.apache.org"; 
Cc: "徐涛"<77044...@qq.com>; 
Subject:  Re: RE: Can't access remote Hive table from spark



Hi Skanda,


How do you set up your SPARK_CLASSPATH?


I added the following line to my SPARK_HOME/conf/spark-env.sh, and still got
the same error.
 
export SPARK_CLASSPATH=${SPARK_CLASSPATH}:/etc/hive/conf





-- Original --
From:  "Skanda Prasad";;
Send time: Monday, Jan 26, 2015 7:41 AM
To: ""; "user@spark.apache.org"; 

Subject:  RE: Can't access remote Hive table from spark



This happened to me as well; putting hive-site.xml inside conf doesn't seem to
work. Instead I added /etc/hive/conf to SPARK_CLASSPATH and it worked. You can
try this approach.

-Skanda


From: guxiaobo1982
Sent: 25-01-2015 13:50
To: user@spark.apache.org
Subject: Can't access remote Hive table from spark


Hi,
I built and started a single-node standalone Spark 1.2.0 cluster along with a
single-node Hive 0.14.0 instance installed by Ambari 1.7.0. On the Spark and
Hive node I can create and query tables inside Hive, and from remote machines I
can submit the SparkPi example to the Spark master. But I failed to run the
following example code:


 
public class SparkTest {
public static void main(String[] args)
{
String appName = "This is a test application";
String master = "spark://lix1.bh.com:7077";
SparkConf conf = new SparkConf().setAppName(appName).setMaster(master);
JavaSparkContext sc = new JavaSparkContext(conf);
JavaHiveContext sqlCtx = new org.apache.spark.sql.hive.api.java.JavaHiveContext(sc);
//sqlCtx.sql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING)");
//sqlCtx.sql("LOAD DATA LOCAL INPATH '/opt/spark/examples/src/main/resources/kv1.txt' INTO TABLE src");
// Queries are expressed in HiveQL.
List<Row> rows = sqlCtx.sql("FROM src SELECT key, value").collect();
System.out.print("I got " + rows.size() + " rows \r\n");
sc.close();
}
}




Exception in thread "main" org.apache.hadoop.hive.ql.metadata.InvalidTableException: Table not found src
at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:980)
at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:950)
at org.apache.spark.sql.hive.HiveMetastoreCatalog.lookupRelation(HiveMetastoreCatalog.scala:70)
at org.apache.spark.sql.hive.HiveContext$$anon$2.org$apache$spark$sql$catalyst$analysis$OverrideCatalog$$super$lookupRelation(HiveContext.scala:253)
at org.apache.spark.sql.catalyst.analysis.OverrideCatalog$$anonfun$lookupRelation$3.apply(Catalog.scala:141)
at org.apache.spark.sql.catalyst.analysis.OverrideCatalog$$anonfun$lookupRelation$3.apply(Catalog.scala:141)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.sql.catalyst.analysis.OverrideCatalog$class.lookupRelation(Catalog.scala:141)
at org.apache.spark.sql.hive.HiveContext$$anon$2.lookupRelation(HiveContext.scala:253)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$5.applyOrElse(Analyzer.scala:143)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$5.applyOrElse(Analyzer.scala:138)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:144)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:162)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at org.

Re: RE: Can't access remote Hive table from spark

2015-01-31 Thread guxiaobo1982
Hi Skanda,


How do you set up your SPARK_CLASSPATH?


I added the following line to my SPARK_HOME/conf/spark-env.sh, and still got
the same error.
 
export SPARK_CLASSPATH=${SPARK_CLASSPATH}:/etc/hive/conf





-- Original --
From:  "Skanda Prasad";;
Send time: Monday, Jan 26, 2015 7:41 AM
To: ""; "user@spark.apache.org"; 

Subject:  RE: Can't access remote Hive table from spark



This happened to me as well; putting hive-site.xml inside conf doesn't seem to
work. Instead I added /etc/hive/conf to SPARK_CLASSPATH and it worked. You can
try this approach.

-Skanda


From: guxiaobo1982
Sent: 25-01-2015 13:50
To: user@spark.apache.org
Subject: Can't access remote Hive table from spark


Hi,
I built and started a single-node standalone Spark 1.2.0 cluster along with a
single-node Hive 0.14.0 instance installed by Ambari 1.7.0. On the Spark and
Hive node I can create and query tables inside Hive, and from remote machines I
can submit the SparkPi example to the Spark master. But I failed to run the
following example code:


 
public class SparkTest {
public static void main(String[] args)
{
String appName = "This is a test application";
String master = "spark://lix1.bh.com:7077";
SparkConf conf = new SparkConf().setAppName(appName).setMaster(master);
JavaSparkContext sc = new JavaSparkContext(conf);
JavaHiveContext sqlCtx = new org.apache.spark.sql.hive.api.java.JavaHiveContext(sc);
//sqlCtx.sql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING)");
//sqlCtx.sql("LOAD DATA LOCAL INPATH '/opt/spark/examples/src/main/resources/kv1.txt' INTO TABLE src");
// Queries are expressed in HiveQL.
List<Row> rows = sqlCtx.sql("FROM src SELECT key, value").collect();
System.out.print("I got " + rows.size() + " rows \r\n");
sc.close();
}
}




Exception in thread "main" org.apache.hadoop.hive.ql.metadata.InvalidTableException: Table not found src
at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:980)
at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:950)
at org.apache.spark.sql.hive.HiveMetastoreCatalog.lookupRelation(HiveMetastoreCatalog.scala:70)
at org.apache.spark.sql.hive.HiveContext$$anon$2.org$apache$spark$sql$catalyst$analysis$OverrideCatalog$$super$lookupRelation(HiveContext.scala:253)
at org.apache.spark.sql.catalyst.analysis.OverrideCatalog$$anonfun$lookupRelation$3.apply(Catalog.scala:141)
at org.apache.spark.sql.catalyst.analysis.OverrideCatalog$$anonfun$lookupRelation$3.apply(Catalog.scala:141)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.sql.catalyst.analysis.OverrideCatalog$class.lookupRelation(Catalog.scala:141)
at org.apache.spark.sql.hive.HiveContext$$anon$2.lookupRelation(HiveContext.scala:253)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$5.applyOrElse(Analyzer.scala:143)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$5.applyOrElse(Analyzer.scala:138)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:144)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:162)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:191)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:147)
at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:135)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:138)
at org.apache.spark.sql.cataly

RE: Can't access remote Hive table from spark

2015-01-25 Thread Skanda Prasad
This happened to me as well; putting hive-site.xml inside conf doesn't seem to
work. Instead I added /etc/hive/conf to SPARK_CLASSPATH and it worked. You can
try this approach.

-Skanda
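
Once /etc/hive/conf is on the classpath as described above, the remote metastore URI should also become visible through HiveConf, which loads hive-site.xml from the classpath. A minimal check along these lines (a sketch only; the class name MetastoreUriCheck is illustrative, and it assumes the Hive classes bundled with the Spark assembly are on the classpath):

import org.apache.hadoop.hive.conf.HiveConf;

// HiveConf picks up hive-site.xml from the classpath. If the Hive conf
// directory really is on the classpath, this prints the remote metastore URI;
// an empty value means a local Derby metastore would be used instead.
public class MetastoreUriCheck {
    public static void main(String[] args) {
        HiveConf hiveConf = new HiveConf();
        System.out.println("hive.metastore.uris = " + hiveConf.get("hive.metastore.uris", ""));
    }
}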

-Original Message-
From: "guxiaobo1982" 
Sent: 25-01-2015 13:50
To: "user@spark.apache.org" 
Subject: Can't access remote Hive table from spark

Hi,
I built and started a single-node standalone Spark 1.2.0 cluster along with a
single-node Hive 0.14.0 instance installed by Ambari 1.7.0. On the Spark and
Hive node I can create and query tables inside Hive, and from remote machines I
can submit the SparkPi example to the Spark master. But I failed to run the
following example code:


public class SparkTest {
public static void main(String[] args)
{
String appName = "This is a test application";
String master = "spark://lix1.bh.com:7077";
SparkConf conf = new SparkConf().setAppName(appName).setMaster(master);
JavaSparkContext sc = new JavaSparkContext(conf);
JavaHiveContext sqlCtx = new org.apache.spark.sql.hive.api.java.JavaHiveContext(sc);
//sqlCtx.sql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING)");
//sqlCtx.sql("LOAD DATA LOCAL INPATH '/opt/spark/examples/src/main/resources/kv1.txt' INTO TABLE src");
// Queries are expressed in HiveQL.
List<Row> rows = sqlCtx.sql("FROM src SELECT key, value").collect();
System.out.print("I got " + rows.size() + " rows \r\n");
sc.close();
}
}


Exception in thread "main" org.apache.hadoop.hive.ql.metadata.InvalidTableException: Table not found src
at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:980)
at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:950)
at org.apache.spark.sql.hive.HiveMetastoreCatalog.lookupRelation(HiveMetastoreCatalog.scala:70)
at org.apache.spark.sql.hive.HiveContext$$anon$2.org$apache$spark$sql$catalyst$analysis$OverrideCatalog$$super$lookupRelation(HiveContext.scala:253)
at org.apache.spark.sql.catalyst.analysis.OverrideCatalog$$anonfun$lookupRelation$3.apply(Catalog.scala:141)
at org.apache.spark.sql.catalyst.analysis.OverrideCatalog$$anonfun$lookupRelation$3.apply(Catalog.scala:141)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.sql.catalyst.analysis.OverrideCatalog$class.lookupRelation(Catalog.scala:141)
at org.apache.spark.sql.hive.HiveContext$$anon$2.lookupRelation(HiveContext.scala:253)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$5.applyOrElse(Analyzer.scala:143)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$5.applyOrElse(Analyzer.scala:138)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:144)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:162)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:191)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:147)
at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:135)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:138)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:137)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:61)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:59)
at scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:111)
at scala.collection.immutable.List.foldLeft(List.scala:84)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:59)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:51)
at scala.collection.immutable.List.foreach(List.scala:318)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.apply(RuleExecutor.scala:51)
at org.apache.spark.sql.SQLContext$QueryExecution.analyzed$lzycompute(SQLContext.scala:411)
at org.apache.spark.sql.SQLContext$QueryExecution.analyzed(SQLContext.scala:411)
at org.apache.spark.sql.SQLContext$QueryExecution.withCachedData$lzycompute(SQ