Hi guys,

Has anyone tried Spark 3 on k8s reading data from HDFS encrypted with KMS
in HA mode (with Kerberos)?

I have a wordcount job running with Spark 3 reading data on HDFS (Hadoop
3.1), everything secured with Kerberos. Everything works fine if the data
folder is not encrypted (Spark on k8s [...] using Spark with HDFS encrypted
with KMS :-)

Thanks,
Michel
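A minimal sketch of the setup in question (all hostnames and paths here are
hypothetical, and Kerberos credentials are assumed to be handled at
spark-submit time):

// Reading from an encrypted zone requires the HDFS client inside the Spark
// pods to know the KMS provider; listing several hosts separated by ';'
// is how the Hadoop client addresses KMS in HA mode.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("wordcount-kms")
  // spark.hadoop.* settings are forwarded into the Hadoop Configuration.
  .config("spark.hadoop.hadoop.security.key.provider.path",
          "kms://https@kms1.example.com;kms2.example.com:9600/kms")
  .getOrCreate()

val counts = spark.sparkContext
  .textFile("hdfs:///encrypted-zone/input")   // hypothetical encrypted path
  .flatMap(_.split("\\s+"))
  .map((_, 1L))
  .reduceByKey(_ + _)
counts.take(10).foreach(println)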
Hello,

I am reading data from HDFS in a Spark application and, as far as I have
read, each HDFS block is one partition for Spark by default. Is there any
way to select only one block from HDFS to read in my Spark application?

Thank you,
Thodoris
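A minimal sketch of how to see the block/partition relationship, assuming a
spark-shell session (sc defined) and a hypothetical path; each HDFS block
normally backs one input split, hence one partition:

import org.apache.hadoop.fs.{FileSystem, Path}

val path = new Path("hdfs:///data/input.txt")
val fs = path.getFileSystem(sc.hadoopConfiguration)
val status = fs.getFileStatus(path)

// One BlockLocation per HDFS block: its byte offset, length, and hosts.
fs.getFileBlockLocations(status, 0, status.getLen).foreach { b =>
  println(s"offset=${b.getOffset} length=${b.getLength} hosts=${b.getHosts.mkString(",")}")
}

// For a large file, the default partition count of a plain text read
// typically matches the number of blocks.
println(sc.textFile("hdfs:///data/input.txt").getNumPartitions)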
Hi,

I have a few questions about the structure of HDFS and S3 when something
like Spark loads data from the two storage systems.

Generally, when Spark loads data from HDFS, HDFS supports data locality and
already owns the files distributed across the datanodes, right? Spark can
just process the data on the workers.

What about S3?
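For what it's worth, the read API is the same for both stores (paths
hypothetical); the difference is underneath: HDFS reports block locations
so tasks can be scheduled next to the data, while s3a has no locality to
report and every read is remote.

val fromHdfs = spark.read.textFile("hdfs:///warehouse/events")
val fromS3   = spark.read.textFile("s3a://my-bucket/events")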
I configured HDFS to cache files in HDFS's centralized cache, like the
following:

hdfs cacheadmin -addPool hibench
hdfs cacheadmin -addDirective -path /HiBench/Kmeans/Input -pool hibench

But I didn't see much performance impact, no matter how I configured
dfs.datanode.max.locked.memory.

Is it possible that
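One way to quantify "much performance impact" (a sketch assuming a
spark-shell session; the path is the one from the cache directive above):
time a full scan before and after adding the directive.

def timeScan(): Double = {
  val t0 = System.nanoTime()
  sc.textFile("/HiBench/Kmeans/Input").count()
  (System.nanoTime() - t0) / 1e9
}
println(s"scan took ${timeScan()} s")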
Have you read this thread?
http://search-hadoop.com/m/uOzYttXZcg1M6oKf2/HDFS+cache=RE+hadoop+hdfs+cache+question+do+client+processes+share+cache+
Cheers
On Mon, Jan 25, 2016 at 1:23 PM, Jia Zou wrote:
> I configured HDFS to cache file in HDFS's cache, like following:
Please see also:
http://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-hdfs/CentralizedCacheManagement.html

According to Chris Nauroth, an HDFS committer, it's extremely difficult to
use the feature correctly. The feature also brings operational complexity.
Since off-heap memory is
Hi,

How can I load partial data from HDFS using Spark SQL? Suppose I want to
load data based on a filter like "Select * from table where id = " using
Spark SQL with DataFrames; how can that be done? The idea here is that I
do not want to load the whole data into memory when I use the
OK, so what's wrong with using:

var df = HiveContext.sql("Select * from table where id = ")
// filtered data frame
df.count
On Sat, Jan 2, 2016 at 11:56 AM, SRK <swethakasire...@gmail.com> wrote:
> Hi,
>
> How to load partial data from hdfs using Spark SQL? Suppose I
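A sketch of the same idea in the DataFrame API (table path and id value
hypothetical): with a columnar source such as Parquet, the filter is pushed
down to the scan, so only matching data is materialized rather than the
whole table.

import org.apache.spark.sql.functions.col

val df = sqlContext.read.parquet("hdfs:///tables/mytable")
  .filter(col("id") === 42)
df.explain()       // the physical plan should list the pushed-down filter
println(df.count())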
I'm just reading data from HDFS through Spark. It throws
*java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be
cast to org.apache.hadoop.io.BytesWritable* at line no. 6. I never used
LongWritable in my code; I have no idea how the data was in that format.

Note: I'm not using
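A guess at the cause, with a sketch (path hypothetical): if the underlying
file is plain text, Hadoop's TextInputFormat produces LongWritable keys
(the byte offset of each line), so asking Spark for BytesWritable keys
raises exactly this ClassCastException. Matching the on-disk types avoids
it:

import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapred.TextInputFormat

val lines = sc
  .hadoopFile[LongWritable, Text, TextInputFormat]("hdfs:///data/input")
  .map { case (_, text) => text.toString }
println(lines.first())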
Once you convert your data to a DataFrame (look at spark-csv), try
df.write.partitionBy("yyyy", "mm").save("...").
On Thu, Oct 1, 2015 at 4:11 PM, haridass saisriram <haridass.saisri...@gmail.com> wrote:
> Hi,
>
> I am trying to find a simple example to read a data file on HDFS. The
> file
Hi,

I am trying to find a simple example to read a data file on HDFS. The file
has the following format:

a,b,c,yyyy,mm
a1,b1,c1,2015,09
a2,b2,c2,2014,08

I would like to read this file and store it in HDFS partitioned by year
and month, something like this:

/path/to/hdfs/yyyy/mm

I want to
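A sketch following the spark-csv suggestion above (paths hypothetical;
"yyyy" and "mm" are the year and month columns from the sample). Note that
partitionBy writes Hive-style directories such as .../yyyy=2015/mm=09/
rather than bare /2015/09/.

val df = sqlContext.read
  .format("com.databricks.spark.csv")
  .option("header", "true")
  .load("hdfs:///path/to/input.csv")

df.write
  .partitionBy("yyyy", "mm")
  .format("parquet")
  .save("hdfs:///path/to/output")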