[
https://issues.apache.org/jira/browse/SPARK-8409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14595716#comment-14595716
]
Arun commented on SPARK-8409:
-----------------------------
Dear Shivaram,
When using the Spark shell, installing the package with ".\spark-shell --packages com.databricks:spark-csv_2.11:1.0.3" works: the CSV package is installed successfully. The problem is with the SparkR shell only. Kindly advise how the package can be installed in the SparkR shell.
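For what it is worth, the sparkR launcher script shipped with Spark 1.4 accepts the same --packages flag as spark-shell, so a hedged sketch of the equivalent invocation (untested on Windows; the path is assumed from the transcript below) would be:

```shell
REM Hypothetical sketch: pass the same Maven coordinate to the SparkR launcher.
REM Directory layout assumed from the spark-1.4.0-bin-hadoop2.6 transcript below.
E:\spark-1.4.0-bin-hadoop2.6\bin>.\sparkR --packages com.databricks:spark-csv_2.11:1.0.3
```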
E:\spark-1.4.0-bin-hadoop2.6\bin>.\spark-shell --packages com.databricks:spark-csv_2.11:1.0.3
Ivy Default Cache set to: C:\Users\acer1\.ivy2\cache
The jars for the packages stored in: C:\Users\acer1\.ivy2\jars
:: loading settings :: url = jar:file:/E:/spark-1.4.0-bin-hadoop2.6/lib/spark-assembly-1.4.0-hadoop2.6.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
com.databricks#spark-csv_2.11 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
confs: [default]
found com.databricks#spark-csv_2.11;1.0.3 in central
found org.apache.commons#commons-csv;1.1 in central
downloading https://repo1.maven.org/maven2/com/databricks/spark-csv_2.11/1.0.3/spark-csv_2.11-1.0.3.jar ...
[SUCCESSFUL ] com.databricks#spark-csv_2.11;1.0.3!spark-csv_2.11.jar (705ms)
downloading https://repo1.maven.org/maven2/org/apache/commons/commons-csv/1.1/commons-csv-1.1.jar ...
[SUCCESSFUL ] org.apache.commons#commons-csv;1.1!commons-csv.jar (479ms)
:: resolution report :: resolve 11565ms :: artifacts dl 1200ms
:: modules in use:
com.databricks#spark-csv_2.11;1.0.3 from central in [default]
org.apache.commons#commons-csv;1.1 from central in [default]
---------------------------------------------------------------------
| | modules || artifacts |
| conf | number| search|dwnlded|evicted|| number|dwnlded|
---------------------------------------------------------------------
| default | 2 | 2 | 2 | 0 || 2 | 2 |
---------------------------------------------------------------------
:: retrieving :: org.apache.spark#spark-submit-parent
confs: [default]
2 artifacts copied, 0 already retrieved (90kB/63ms)
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/06/22 15:33:22 INFO SecurityManager: Changing view acls to: acer1
15/06/22 15:33:22 INFO SecurityManager: Changing modify acls to: acer1
15/06/22 15:33:22 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(acer1); users with modify permissions: Set(acer1)
15/06/22 15:33:22 INFO HttpServer: Starting HTTP Server
15/06/22 15:33:22 INFO Utils: Successfully started service 'HTTP class server' on port 53987.
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/___/ .__/\_,_/_/ /_/\_\ version 1.4.0
/_/
Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_71)
Type in expressions to have them evaluated.
Type :help for more information.
15/06/22 15:33:27 INFO SparkContext: Running Spark version 1.4.0
15/06/22 15:33:27 INFO SecurityManager: Changing view acls to: acer1
15/06/22 15:33:27 INFO SecurityManager: Changing modify acls to: acer1
15/06/22 15:33:27 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(acer1); users with modify permissions: Set(acer1)
15/06/22 15:33:28 INFO Slf4jLogger: Slf4jLogger started
15/06/22 15:33:28 INFO Remoting: Starting remoting
15/06/22 15:33:28 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://[email protected]:54000]
15/06/22 15:33:28 INFO Utils: Successfully started service 'sparkDriver' on port 54000.
15/06/22 15:33:28 INFO SparkEnv: Registering MapOutputTracker
15/06/22 15:33:28 INFO SparkEnv: Registering BlockManagerMaster
15/06/22 15:33:28 INFO DiskBlockManager: Created local directory at C:\Users\acer1\AppData\Local\Temp\spark-7805dd92-cc04-44f0-9b1c-2993939f7b21\blockmgr-b7c44ce9-7ad7-4a03-b041-7a0aa491de10
15/06/22 15:33:28 INFO MemoryStore: MemoryStore started with capacity 265.4 MB
15/06/22 15:33:28 INFO HttpFileServer: HTTP File server directory is C:\Users\acer1\AppData\Local\Temp\spark-7805dd92-cc04-44f0-9b1c-2993939f7b21\httpd-a833b562-71a5-400e-85e3-821f4760348c
15/06/22 15:33:28 INFO HttpServer: Starting HTTP Server
15/06/22 15:33:28 INFO Utils: Successfully started service 'HTTP file server' on port 54001.
15/06/22 15:33:28 INFO SparkEnv: Registering OutputCommitCoordinator
15/06/22 15:33:28 INFO Utils: Successfully started service 'SparkUI' on port 4040.
15/06/22 15:33:28 INFO SparkUI: Started SparkUI at http://192.168.88.1:4040
15/06/22 15:33:28 INFO SparkContext: Added JAR file:/C:/Users/acer1/.ivy2/jars/com.databricks_spark-csv_2.11-1.0.3.jar at http://192.168.88.1:54001/jars/com.databricks_spark-csv_2.11-1.0.3.jar with timestamp 1434967408937
15/06/22 15:33:28 INFO SparkContext: Added JAR file:/C:/Users/acer1/.ivy2/jars/org.apache.commons_commons-csv-1.1.jar at http://192.168.88.1:54001/jars/org.apache.commons_commons-csv-1.1.jar with timestamp 1434967408937
15/06/22 15:33:28 INFO Executor: Starting executor ID driver on host localhost
15/06/22 15:33:28 INFO Executor: Using REPL class URI: http://192.168.88.1:53987
15/06/22 15:33:29 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 54020.
15/06/22 15:33:29 INFO NettyBlockTransferService: Server created on 54020
15/06/22 15:33:29 INFO BlockManagerMaster: Trying to register BlockManager
15/06/22 15:33:29 INFO BlockManagerMasterEndpoint: Registering block manager localhost:54020 with 265.4 MB RAM, BlockManagerId(driver, localhost, 54020)
15/06/22 15:33:29 INFO BlockManagerMaster: Registered BlockManager
15/06/22 15:33:29 INFO SparkILoop: Created spark context..
Spark context available as sc.
15/06/22 15:33:30 INFO HiveContext: Initializing execution hive, version 0.13.1
15/06/22 15:33:31 INFO HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
15/06/22 15:33:31 INFO ObjectStore: ObjectStore, initialize called
15/06/22 15:33:31 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored
15/06/22 15:33:31 INFO Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
15/06/22 15:33:32 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
15/06/22 15:33:32 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
15/06/22 15:33:43 INFO ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
15/06/22 15:33:43 INFO MetaStoreDirectSql: MySQL check failed, assuming we are not on mysql: Lexical error at line 1, column 5. Encountered: "@" (64), after : "".
15/06/22 15:33:46 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
15/06/22 15:33:46 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
15/06/22 15:33:54 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
15/06/22 15:33:54 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
15/06/22 15:33:56 INFO ObjectStore: Initialized ObjectStore
15/06/22 15:33:56 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 0.13.1aa
15/06/22 15:33:57 INFO HiveMetaStore: Added admin role in metastore
15/06/22 15:33:57 INFO HiveMetaStore: Added public role in metastore
15/06/22 15:33:58 INFO HiveMetaStore: No user is added in admin role, since config is empty
15/06/22 15:33:58 INFO SessionState: No Tez session required at this point. hive.execution.engine=mr.
15/06/22 15:33:58 INFO SparkILoop: Created sql context (with Hive support)..
SQL context available as sqlContext.
Thanks,
Arun Gunalan.B
> In windows cant able to read .csv or .json files using read.df()
> -----------------------------------------------------------------
>
> Key: SPARK-8409
> URL: https://issues.apache.org/jira/browse/SPARK-8409
> Project: Spark
> Issue Type: Bug
> Components: SparkR, Windows
> Affects Versions: 1.4.0
> Environment: sparkR API
> Reporter: Arun
> Priority: Critical
>
> Hi,
> In SparkR shell, I invoke:
> > mydf<-read.df(sqlContext, "/home/esten/ami/usaf.json", source="json",
> > header="false")
> I have tried various file types (csv, txt); all fail.
> For example, in the SparkR shell of Spark 1.4: df_1 <- read.df(sqlContext,
> "E:/setup/spark-1.4.0-bin-hadoop2.6/spark-1.4.0-bin-hadoop2.6/examples/src/main/resources/nycflights13.csv",
> source = "csv")
> RESPONSE: "ERROR RBackendHandler: load on 1 failed"
> BELOW THE WHOLE RESPONSE:
> 15/06/16 08:09:13 INFO MemoryStore: ensureFreeSpace(177600) called with curMem=0, maxMem=278302556
> 15/06/16 08:09:13 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 173.4 KB, free 265.2 MB)
> 15/06/16 08:09:13 INFO MemoryStore: ensureFreeSpace(16545) called with curMem=177600, maxMem=278302556
> 15/06/16 08:09:13 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 16.2 KB, free 265.2 MB)
> 15/06/16 08:09:13 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:37142 (size: 16.2 KB, free: 265.4 MB)
> 15/06/16 08:09:13 INFO SparkContext: Created broadcast 0 from load at NativeMethodAccessorImpl.java:-2
> 15/06/16 08:09:16 WARN DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
> 15/06/16 08:09:17 ERROR RBackendHandler: load on 1 failed
> java.lang.reflect.InvocationTargetException
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.spark.api.r.RBackendHandler.handleMethodCall(RBackendHandler.scala:127)
>     at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:74)
>     at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:36)
>     at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>     at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
>     at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>     at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
>     at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:163)
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>     at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
>     at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
>     at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130)
>     at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>     at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>     at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>     at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>     at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
>     at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://smalldata13.hdp:8020/home/esten/ami/usaf.json
>     at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:285)
>     at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:228)
>     at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:313)
>     at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:207)
>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
>     at scala.Option.getOrElse(Option.scala:120)
>     at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
>     at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
>     at scala.Option.getOrElse(Option.scala:120)
>     at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
>     at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
>     at scala.Option.getOrElse(Option.scala:120)
>     at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
>     at org.apache.spark.rdd.RDD$$anonfun$treeAggregate$1.apply(RDD.scala:1069)
>     at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:148)
>     at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:109)
>     at org.apache.spark.rdd.RDD.withScope(RDD.scala:286)
>     at org.apache.spark.rdd.RDD.treeAggregate(RDD.scala:1067)
>     at org.apache.spark.sql.json.InferSchema$.apply(InferSchema.scala:58)
>     at org.apache.spark.sql.json.JSONRelation$$anonfun$schema$1.apply(JSONRelation.scala:139)
>     at org.apache.spark.sql.json.JSONRelation$$anonfun$schema$1.apply(JSONRelation.scala:138)
>     at scala.Option.getOrElse(Option.scala:120)
>     at org.apache.spark.sql.json.JSONRelation.schema$lzycompute(JSONRelation.scala:137)
>     at org.apache.spark.sql.json.JSONRelation.schema(JSONRelation.scala:137)
>     at org.apache.spark.sql.sources.LogicalRelation.<init>(LogicalRelation.scala:30)
>     at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:120)
>     at org.apache.spark.sql.SQLContext.load(SQLContext.scala:1230)
>     ... 25 more
> Error: returnStatus == 0 is not TRUE
>
>
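The "Caused by" line in the trace above shows the relative path "/home/esten/ami/usaf.json" being resolved against the cluster's default filesystem (hdfs://smalldata13.hdp:8020) rather than the local disk. A hedged sketch of a possible workaround, not a verified fix on Windows: use an explicit file:/// URI, and for CSV pass the spark-csv source by its full name, which is what Spark 1.4 expects for the external package. The paths below are illustrative assumptions.

```r
# Hedged sketch (assumptions, not a confirmed fix):
# an explicit file:/// scheme keeps the path from being resolved against HDFS.
mydf <- read.df(sqlContext,
                "file:///home/esten/ami/usaf.json",
                source = "json")

# For CSV with the spark-csv package, use the full source name in Spark 1.4.
df_1 <- read.df(sqlContext,
                "file:///E:/setup/spark-1.4.0-bin-hadoop2.6/spark-1.4.0-bin-hadoop2.6/examples/src/main/resources/nycflights13.csv",
                source = "com.databricks.spark.csv",
                header = "true")
```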
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]