Sorry about the confusion I created. I only started learning this week.
Silly me, I was actually writing the schema to a txt file and expecting
records; this is what I was supposed to do. Also, could you let me know
how to load the data from the jsonFile/jsonRDD methods of hiveContext into a Hive table?
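To be concrete, this is roughly what I am trying to get working (a rough
sketch in Scala, untested; the JSON path and the table names people_json and
people_hive are just placeholders, and registerTempTable assumes Spark 1.1+):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

val sc = new SparkContext(new SparkConf().setAppName("JsonToHive"))
val hctx = new HiveContext(sc)

// jsonFile infers a schema from the JSON records and returns a SchemaRDD
val json = hctx.jsonFile("hdfs:///data/people.json")

// register it as a temporary table, then materialize it as a Hive table
json.registerTempTable("people_json")
hctx.sql("CREATE TABLE people_hive AS SELECT * FROM people_json")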
Thanks for replying. I was unable to figure out how, after using
jsonFile/jsonRDD, to load the data into a Hive table. Also, I was able to
save the SchemaRDD I got via hiveContext.sql(...).saveAsParquetFile(Path),
i.e. save the SchemaRDD as a Parquet file, but when I tried to fetch data
from the Parquet file ba
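For reference, the read-back I am attempting looks roughly like this (a
sketch; the Parquet path is a placeholder and hctx is the same HiveContext
used for the save):

// load the Parquet file back as a SchemaRDD and query it
val parquet = hctx.parquetFile("hdfs:///output/sparkHive1.parquet")
parquet.registerTempTable("parquet_data")
hctx.sql("SELECT * FROM parquet_data").collect().foreach(println)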
The code below contains one part which creates a table in Hive from the
data, and another part below it which creates a schema.
*Now I try to save the queried data as a Parquet file, where
hctx.sql("Select * from sparkHive1") returns me a SchemaRDD
which contains the records from the table.*
hctx.sq
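The save step I am attempting looks roughly like this (a sketch; the output
path is made up):

val result = hctx.sql("SELECT * FROM sparkHive1") // returns a SchemaRDD
result.saveAsParquetFile("hdfs:///output/sparkHive1.parquet")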
Oops, I guess this is the right way to do it:
mvn -Phive -Dhadoop.version=1.2.1 clean -DskipTests package
I am using Apache Hadoop 1.2.1. I wanted to use Spark SQL with Hive, so I
tried to build Spark like so:
> mvn -Phive,hadoop-1.2 -Dhadoop.version=1.2.1 clean -DskipTests package
But I get the following error:
The requested profile "hadoop-1.2" could not be activated because it does
not exist.
As of now my approach is to fetch all the data from the tables located in
different databases into separate RDDs, union them, and then query them
together. I want to know whether I can perform the query directly while
creating the RDD, i.e. instead of creating two RDDs, firing a single query directly.
I want to be able to perform a query on two tables in different databases,
and I want to know whether it can be done. I've heard about taking the union
of two RDDs, but here I want something more like connecting to different
partitions of a table. Any help is appreciated.
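To be concrete, this single-statement version is what I am hoping for instead
of the two-RDD union (a rough sketch using HiveContext; db1, db2 and the
table/column names are made up):

// one HiveQL statement spanning two databases, instead of two RDDs + union
val joined = hctx.sql(
  "SELECT a.id, a.name, b.total " +
  "FROM db1.customers a JOIN db2.orders b ON a.id = b.customer_id")
joined.collect().foreach(println)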
import java.io.Serializable;
//import org.ju
So far I have tried this and I am able to compile it successfully. There
isn't enough documentation on Spark for its usage with databases. I am using
AbstractFunction0 and AbstractFunction1 here. I am unable to access the
database; the jar just runs without doing anything when submitted. I want t