I have a Flume agent which streams data into an HDFS sink (appending to the
same file), and I can see the data from HDFS with hdfs dfs -cat. However, when
I run a MapReduce job on the folder that contains the appended data, it only
picks up the first batch that was flushed (batchSize = 100) into HDFS. The
rest is not picked up.
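In case the configuration matters, here is a minimal sketch of the kind of
HDFS sink setup involved (the agent, channel and sink names are made up for
illustration). As far as I understand, batchSize only controls how many events
are flushed per write, while the hdfs.roll* settings control when a file is
closed, and a MapReduce job generally only sees data in files that have
already been closed:

a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = /flume/events
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.writeFormat = Text
# flush to HDFS every 100 events (this does not close the file)
a1.sinks.k1.hdfs.batchSize = 100
# close (roll) files every 60 seconds or every 128 MB, whichever comes first,
# so that appended data becomes visible to MapReduce jobs
a1.sinks.k1.hdfs.rollInterval = 60
a1.sinks.k1.hdfs.rollSize = 134217728
a1.sinks.k1.hdfs.rollCount = 0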
Since Hadoop came out, there have been various commercial and/or open-source
attempts to expose some compatibility with SQL (e.g. Hive:
https://hive.apache.org, Drill: http://drill.apache.org, Spark SQL:
https://spark.apache.org).
I am seeking one which is good for low-latency querying and supports the most
common CRUD operations.
Note that there is a difference between being dead and being stale. Stale
means avoid as much as possible, while dead means avoid absolutely AND
initiate a recovery, i.e. copy all the data (typically 1 TB or more).
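In case it helps, the stale handling is driven by a few hdfs-site.xml
properties; a rough sketch, with values that are only illustrative (my
recollection is that the stale interval defaults to 30 seconds):

<property>
  <!-- milliseconds without a heartbeat before a datanode is marked stale -->
  <name>dfs.namenode.stale.datanode.interval</name>
  <value>30000</value>
</property>
<property>
  <name>dfs.namenode.avoid.read.stale.datanode</name>
  <value>true</value>
</property>
<property>
  <name>dfs.namenode.avoid.write.stale.datanode</name>
  <value>true</value>
</property>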
There is some info on this blog entry:
Are you running NTP?
On Friday, January 23, 2015 12:42 AM, Fabio anyte...@gmail.com wrote:
Hi guys,
while analyzing SLS logs I noticed some unexpected behaviors, such as
resource requests sent before the AM container gets to a RUNNING state.
For this reason I started wondering how
I also have clients with complicated requirements who need help with this. I
would like to help as well.
Date: Mon, 26 Jan 2015 18:49:42 +
Subject: Re: Hadoop Security Community
From: ranadi...@gmail.com
To: user@hadoop.apache.org
Hi Adam,
I am interested in collaborating on this. I am working for a
Hi Matt!
Take a look at the mapreduce.jobhistory.* configuration parameters here for the
delay in moving finished jobs to the HistoryServer:
https://hadoop.apache.org/docs/stable/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml
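If it is the lag before finished jobs show up in the HistoryServer that you
are seeing, my understanding is that the scan/move interval is the main knob;
a sketch of the mapred-site.xml entries (the values shown are illustrative,
not necessarily your defaults):

<property>
  <!-- how often finished jobs are scanned and moved to the done directory -->
  <name>mapreduce.jobhistory.move.interval-ms</name>
  <value>180000</value>
</property>
<property>
  <name>mapreduce.jobhistory.move.thread-count</name>
  <value>3</value>
</property>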
I've seen this error: "hadoop is not allowed"
Hi Dave!
Here is the class which is used to store all the edits:
https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java#L575
HTH,
Ravi
On Monday, January 26, 2015 10:32 AM, dlmar...@comcast.net
If multiple directories are specified for dfs.namenode.name.dir and
dfs.namenode.edits.dir, are the writes to the different directories done in
parallel or serially?
Does dfs.namenode.shared.edits.dir support multiple directories like the
properties above?
Thanks,
Dave
All:
The Center for Internet Security (CIS) has established a Community focused on
defining a configuration benchmark for Hadoop. We are in the early stages of
benchmark development, and hope that you will consider joining the effort.
Over the course of the next several days a draft
Dear Adam,
I am interested in collaborating on this. I work with Cloudera and teach Hadoop
courses, such as the Administrator course. I am learning about security
implementation and think a common benchmark would be great for the community.
What are the requirements for contributions? I volunteer
If multiple directories are specified for dfs.namenode.name.dir and
dfs.namenode.edits.dir, are the writes to the different directories done in
parallel or serially?
Does dfs.namenode.shared.edits.dir support multiple directories like the
properties above?
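For concreteness, this is the sort of configuration in question (the paths and
the quorum URI below are made up):

<property>
  <!-- comma-separated list: the namespace image is kept in each directory -->
  <name>dfs.namenode.name.dir</name>
  <value>/data/1/dfs/nn,/data/2/dfs/nn</value>
</property>
<property>
  <name>dfs.namenode.edits.dir</name>
  <value>/data/1/dfs/edits,/data/2/dfs/edits</value>
</property>
<property>
  <!-- shared edits location used for HA -->
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://jn1:8485;jn2:8485;jn3:8485/mycluster</value>
</property>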
Thanks,
Dave
Hi Adam,
I am interested in collaborating on this. I am working for a large financial
institution at the moment, and security is a bit of a pain in the neck, so
this is a major focus area for me right now.
Regards,
Ranadip
On 26 January 2015 at 18:32, mirko.kaempf
Yes I am, does it make a difference? SLS runs on a single machine,
wrapping the RM and simulating the nodes, thus it should use just the
system time.
Or do you mean there is a chance it's updating the clock while the job
is running?
Regards
Fabio
On 01/26/2015 08:00 PM, Ravi Prakash wrote:
Hi Frank,
Can you file an issue to add this configuration to hdfs-default.xml?
On Mon, Jan 26, 2015 at 5:39 PM, Frank Lanitz frank.lan...@sql-ag.de
wrote:
Hi,
On 23.01.2015 at 19:23, Chris Nauroth wrote:
The time period for determining if a datanode is dead is calculated as a
Hi everyone,
We have set up and been playing with Hadoop 1.2.x and its friends
(HBase, Pig, Hive, etc.) on 7 physical servers. We want to test Hadoop
(maybe different versions) and its ecosystem on physical machines
(virtualization is not an option) from different perspectives.
As a bunch of
Hi,
I think the best way is to deploy HDFS Federation with Hadoop 2.x.
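A rough sketch of what a federated hdfs-site.xml can look like, assuming two
NameNodes each serving its own namespace (hostnames and ports are made up):

<property>
  <name>dfs.nameservices</name>
  <value>ns1,ns2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.ns1</name>
  <value>namenode1.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.ns2</name>
  <value>namenode2.example.com:8020</value>
</property>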
On Mon, Jan 26, 2015 at 5:18 PM, Harun Reşit Zafer
harun.za...@tubitak.gov.tr wrote:
Hi everyone,
We have set up and been playing with Hadoop 1.2.x and its friends (HBase,
Pig, Hive, etc.) on 7 physical servers. We want to
Hi,
On 23.01.2015 at 19:23, Chris Nauroth wrote:
The time period for determining if a datanode is dead is calculated as a
function of a few different configuration properties. The current
implementation in DatanodeManager.java does it like this:
final long heartbeatIntervalSeconds =
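The quoted code is cut off above; from memory (so treat this as a sketch
rather than the authoritative source, and the class name below is just for
illustration), the calculation continues roughly as follows, which with the
default 3-second heartbeat and 5-minute recheck interval works out to about
10.5 minutes before a datanode is considered dead:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSConfigKeys;

public class DeadNodeWindow {
  // Reconstructed from memory of DatanodeManager: the expiry window is
  // 2 * recheck interval + 10 * heartbeat interval.
  public static long heartbeatExpireIntervalMs(Configuration conf) {
    final long heartbeatIntervalSeconds = conf.getLong(
        DFSConfigKeys.DFS_HEARTBEAT_INTERVAL_KEY,
        DFSConfigKeys.DFS_HEARTBEAT_INTERVAL_DEFAULT);
    final int heartbeatRecheckInterval = conf.getInt(
        DFSConfigKeys.DFS_NAMENODE_HEARTBEAT_RECHECK_INTERVAL_KEY,
        DFSConfigKeys.DFS_NAMENODE_HEARTBEAT_RECHECK_INTERVAL_DEFAULT);
    // defaults: 2 * 300000 ms + 10 * 1000 * 3 s = 630000 ms = 10.5 minutes
    return 2L * heartbeatRecheckInterval + 10L * 1000L * heartbeatIntervalSeconds;
  }
}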