Hello,
I have the following services configured and installed successfully:
Hadoop 2.7.x
Spark 2.0.x
HBase 1.2.4
Hive 1.2.1
*Installation Directories:*
/usr/local/hadoop
/usr/local/spark
/usr/local/hbase
*Hive Environment variables:*
#HIVE VARIABLES START
export HIVE_HOME=/usr/local/hive
expo
Hello Spark Folks,
Another odd experience I have had with Spark's SQLContext: when I create a
DataFrame, sometimes this call throws an exception and sometimes it does not!
scala> import sqlContext.implicits._
import sqlContext.implicits._
scala> val stdDf = sqlContext.createDataFrame(rowRDD,empSchema.struct
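For comparison, a minimal spark-shell sketch of the same call with an
explicit schema (the empSchema and rowRDD definitions below are illustrative
stand-ins, not the original script's):

import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

// Hypothetical schema; the real empSchema presumably exposes a StructType.
val empSchema = StructType(Seq(
  StructField("id", IntegerType, nullable = false),
  StructField("name", StringType, nullable = true)))

// Hypothetical rows; they must match the schema's types exactly. A mismatch
// only fails when an action evaluates the rows, which is one common reason
// the same createDataFrame call "sometimes throws and sometimes does not".
val rowRDD = sc.parallelize(Seq(Row(1, "alice"), Row(2, "bob")))

val stdDf = sqlContext.createDataFrame(rowRDD, empSchema)
stdDf.show()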
Usually this kind of thing can be done at a lower level in the InputFormat,
typically by specifying the max split size. Have you looked into that
possibility with your InputFormat?
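For what it's worth, a hedged sketch of capping the split size through the
Hadoop job configuration; the property below is the standard key honored by
the new-API FileInputFormat, and a custom InputFormat may or may not respect it:

import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat

// Assumes an existing SparkContext named sc, as in spark-shell.
// Cap each input split at 64 MB so large files yield more partitions.
sc.hadoopConfiguration.setLong(
  "mapreduce.input.fileinputformat.split.maxsize", 64L * 1024 * 1024)

val records = sc.newAPIHadoopFile[LongWritable, Text, TextInputFormat](
  "hdfs:///data/input") // illustrative path
println(records.getNumPartitions) // more, smaller splits than the default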
On Sun, Jan 15, 2017 at 9:42 PM, Fei Hu wrote:
> Hi Jasbir,
>
> Yes, you are right. Do you have any idea about my ques
On 16 Jan 2017, at 11:06, Hyukjin Kwon
<gurwls...@gmail.com> wrote:
Hi,
I just looked through Jacek's page and I believe that is the correct way.
That seems to be a Hadoop-library-specific issue [1]. To the best of my
knowledge, winutils and the binaries in the private repo
are built by a Hadoop PMC member on a dedicated Windows VM, which I believe
are pretty trustworthy.
On 16 Jan 2017, at 12:51, Rostyslav Sotnychenko
<r.sotnyche...@gmail.com> wrote:
Thanks all!
I was using another DFS instead of HDFS, which was logging an error when
fs.delete got called on a non-existing path.
really? Whose DFS, if you don't mind me asking? I'm surprised they logged th
On 16 Jan 2017, at 10:35, assaf.mendelson
<assaf.mendel...@rsa.com> wrote:
Hi,
In the documentation it says Spark is supported on Windows.
The problem, however, is that the documentation's description of Windows setup
is lacking. There are sources (such as
https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-tips-and-tricks-running-spark-windows.html
Hi Pradeep,
That is a good idea. My customized RDDs are similar to the NewHadoopRDD. If
we have billions of InputSplits, will that become a performance bottleneck?
That is, will too much data need to be transferred from the master node to
the compute nodes over the network?
Thanks,
Fei
On Mon, Jan 16, 2
Hello Community,
I am struggling to save a DataFrame to a Hive table.
Versions:
Hive 1.2.1
Spark 2.0.1
*Working code:*
/*
@Author: Chetan Khatri
Description: This Scala script has been written for the HBase to Hive module,
which reads a table from HBase and dumps it out to Hive.
*/ im
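Since the script is cut off here, a minimal, generic Spark 2.x sketch of the
Hive write step (the session settings and table name below are placeholders,
not from the original code):

import org.apache.spark.sql.{SaveMode, SparkSession}

// Hive support must be enabled on the session, and hive-site.xml must be on
// the classpath so Spark talks to the intended metastore.
val spark = SparkSession.builder()
  .appName("HBaseToHive")
  .enableHiveSupport()
  .getOrCreate()

// Stand-in for the DataFrame actually read from HBase.
val df = spark.range(10).toDF("id")

// Writes a managed Hive table; Overwrite replaces it if it already exists.
df.write.mode(SaveMode.Overwrite).saveAsTable("default.employee")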
Cool, thanks!
Jira: https://issues.apache.org/jira/browse/SPARK-19247
PR: https://github.com/apache/spark/pull/16607
I think the LDA model has the exact same issues - currently the
`topicsMatrix` (which is on the order of numWords*k, 4GB for numWords=3m and
k=1000) is saved as a single element in a c
Hi Liang-Chi,
Yes, the splitting logic is needed in compute(). The preferred locations can be
derived from the customized Partition class.
Thanks for your help!
Cheers,
Fei
On Mon, Jan 16, 2017 at 3:00 AM, Liang-Chi Hsieh wrote:
>
> Hi Fei,
>
> I think it should work. But you may need to add few
Thanks all!
I was using another DFS instead of HDFS, which was logging an error when
fs.delete got called on a non-existing path.
In Spark 2.0.1, which I was using previously, everything was working fine
because of an additional check that was made prior to deleting.
However, that check got
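For context, a sketch of the kind of pre-delete guard being described, using
the standard Hadoop FileSystem API (the output path is illustrative):

import org.apache.hadoop.fs.{FileSystem, Path}

// Assumes an existing SparkContext named sc, as in spark-shell.
val fs = FileSystem.get(sc.hadoopConfiguration)
val out = new Path("/tmp/output") // illustrative output path

// Checking existence first keeps a DFS quiet if it logs an error whenever
// delete() is called on a non-existing path.
if (fs.exists(out)) {
  fs.delete(out, true) // recursive delete
}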
Hi,
I just looked through Jacek's page and I believe that is the correct way.
That seems to be a Hadoop-library-specific issue [1]. To the best of my
knowledge, winutils and the binaries in the private repo
are built by a Hadoop PMC member on a dedicated Windows VM, which I believe
are pretty trustworthy.
Th
Hi,
In the documentation it says Spark is supported on Windows.
The problem, however, is that the documentation's description of Windows setup is
lacking. There are sources (such as
https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-tips-and-tricks-running-spark-windows.html
and man
Hi Fei,
I think it should work. But you may need to add some logic in compute() to
decide which half of the parent partition should be output. And you need to
provide the correct preferred locations for the partitions sharing the same
parent partition.
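To make this concrete, a minimal, hypothetical sketch of such an RDD; every
name here is illustrative, since the actual customized RDD and Partition
classes are not shown in this thread:

import scala.reflect.ClassTag
import org.apache.spark.{NarrowDependency, Partition, TaskContext}
import org.apache.spark.rdd.RDD

// Illustrative Partition: one half of a parent partition, carrying its own
// preferred hosts.
class HalfPartition(val index: Int, val parentPartition: Partition,
                    val firstHalf: Boolean, val hosts: Seq[String]) extends Partition

class HalvingRDD[T: ClassTag](parent: RDD[T], hostsPerParent: Array[Seq[String]])
  extends RDD[T](parent.context, Seq(new NarrowDependency[T](parent) {
    // Child partitions 2*i and 2*i + 1 both depend on parent partition i.
    override def getParents(partitionId: Int): Seq[Int] = Seq(partitionId / 2)
  })) {

  override def getPartitions: Array[Partition] =
    parent.partitions.flatMap { p =>
      Seq[Partition](
        new HalfPartition(2 * p.index, p, firstHalf = true, hostsPerParent(p.index)),
        new HalfPartition(2 * p.index + 1, p, firstHalf = false, hostsPerParent(p.index)))
    }

  // The splitting logic lives here: read the parent partition and keep one
  // half (materializing it is wasteful, but keeps the sketch short).
  override def compute(split: Partition, context: TaskContext): Iterator[T] = {
    val hp = split.asInstanceOf[HalfPartition]
    val parentRDD = dependencies.head.rdd.asInstanceOf[RDD[T]]
    val rows = parentRDD.iterator(hp.parentPartition, context).toArray
    val mid = rows.length / 2
    (if (hp.firstHalf) rows.take(mid) else rows.drop(mid)).iterator
  }

  // Preferred locations come straight from the customized Partition class.
  override protected def getPreferredLocations(split: Partition): Seq[String] =
    split.asInstanceOf[HalfPartition].hosts
}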
Fei Hu wrote:
> Hi Liang-Chi,
>
> Yes, you a