from:"Alex Baranau"

[OFFTOPIC] Big Data Application Meetup

2015-06-02 Thread Alex Baranau

at Hadoop Summit and Spark Summit in the following weeks. Thank you, Alex Baranau

Re: Bug in LocalJobRunner?

2013-03-22 Thread Alex Baranau

Hi Harsh J, Thanks for taking a look. I created https://issues.apache.org/jira/browse/MAPREDUCE-5097 and attached a patch. I also provided an (ugly, sorry) example of how to get the error. Alex Baranau On Thu, Mar 21, 2013 at 5:58 AM, Harsh J ha...@cloudera.com wrote: Hi Alex, This seems to make

Bug in LocalJobRunner?

2013-03-20 Thread Alex Baranau

); this.job.setClassLoader(classLoader); } I.e. we need to set the classloader for the job configuration so that it can load classes from the jar. If the above makes sense I will file a JIRA with a patch; otherwise, what am I missing? Thank you, Alex Baranau

Re: Number of concurrent writer to HDFS

2012-08-06 Thread Alex Baranau

you in advance, Alex Baranau -- Sematext :: http://sematext.com/ :: Hadoop - HBase - ElasticSearch - Solr On Mon, Aug 6, 2012 at 2:14 AM, Yanbo Liang yanboha...@gmail.com wrote: You can use scribe or flume to collect log data and integrated with hadoop. 2012/8/4 Nguyen Manh Tien

Bulk Import Data Locality

2012-07-18 Thread Alex Baranau

tasks, this would help us. I believe this is not possible with MR1, please correct me if I'm wrong. Perhaps this is possible with MR2? I assume there's no way to give the NameNode a hint about where to place the blocks of a new file either, right? Thank you, -- Alex Baranau -- Sematext :: http

Fwd: Bulk Import Data Locality

2012-07-18 Thread Alex Baranau

to preserve data locality if the RS goes down (or when anything else causes the region to be reassigned). But since the region size is usually much bigger (at least 10-20 times bigger), this doesn't buy you anything. Alex Baranau -- Sematext :: http://blog.sematext.com/ :: Hadoop - HBase

Fwd: Bulk Import Data Locality

2012-07-18 Thread Alex Baranau

to preserve data locality if the RS goes down (or when anything else causes the region to be reassigned). But since the region size is usually much bigger (at least 10-20 times bigger), this doesn't buy you anything. Alex Baranau -- Sematext :: http://blog.sematext.com/ :: Hadoop - HBase

hadoop fs -du hbase table size

2011-03-14 Thread Alex Baranau

you, Alex Baranau Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - Hadoop - HBase

Making input in Map iterable

2010-12-08 Thread Alex Baranau

(unit-tests work well at least) state. Thank you in advance! Alex Baranau Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - Hadoop - HBase

Re: program running faster on single node than cluster

2010-11-17 Thread Alex Baranau

How many nodes do you use for your fully distributed cluster? Alex Baranau Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - Hadoop - HBase On Wed, Nov 17, 2010 at 5:44 AM, Cornelio Iñigo cornelio.ini...@gmail.com wrote: Hi I have a question to you: I developed a program using

Re: repeat a job for different files

2010-11-17 Thread Alex Baranau

In case you need to process the files separately, use one MR job for each file. You can add a single file as input. I believe you'll need to iterate over all files in the input dir and start a job instance for each file. You can do this in Java code or in a script or... depending on your case. Alex
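The per-file approach described in this reply can be sketched roughly as follows. This is only an illustration using the plain JDK on a local directory: `PerFileJobLauncher`, `runJobPerFile`, and the `submitJob` callback are hypothetical names, and the callback stands in for whatever code configures and submits a real MR job with the given file as its single input. Listing files on HDFS would go through the Hadoop filesystem API instead, which is not shown here.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;

public class PerFileJobLauncher {

    /**
     * Lists the regular files directly inside inputDir (sorted by path) and
     * invokes submitJob once per file. The callback is a hypothetical
     * stand-in for real job submission; it returns true on success.
     * Returns the files whose job reported failure.
     */
    public static List<Path> runJobPerFile(Path inputDir, Predicate<Path> submitJob)
            throws IOException {
        List<Path> files = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(inputDir)) {
            for (Path candidate : stream) {
                // Skip subdirectories: only plain files become job inputs.
                if (Files.isRegularFile(candidate)) {
                    files.add(candidate);
                }
            }
        }
        // Directory iteration order is unspecified, so sort for a stable order.
        Collections.sort(files);

        List<Path> failed = new ArrayList<>();
        for (Path file : files) {
            // One job per file: this file is the job's only input.
            if (!submitJob.test(file)) {
                failed.add(file);
            }
        }
        return failed;
    }
}
```

A caller would pass a lambda that builds and runs a job for the file it receives, then inspect the returned list to decide whether to retry the failed inputs. The jobs run one after another here; whether to run them in parallel depends on the case.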

Re: program running faster on single node than cluster

2010-11-17 Thread Alex Baranau

many map and reduce tasks started for your job and how many nodes are used to process the job. Alex Baranau Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - Hadoop - HBase On Thu, Nov 18, 2010 at 8:19 AM, Cornelio Iñigo cornelio.ini...@gmail.com wrote: Hi the cluster has 12 nodes

Re: wrong value class error

2010-11-16 Thread Alex Baranau

The message refers to the value not being an IntWritable, which is an *input* value type of your reducer (and the output value type of your mapper). Looks like you have a problem with the mapper, not the reducer. Alex Baranau Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - Hadoop - HBase

Re: JobConf

2010-11-14 Thread Alex Baranau

You might find this search tool valuable: http://search-hadoop.com. You can search sources and javadocs separately. Alex Baranau Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - HBase On Sun, Nov 14, 2010 at 10:28 PM, maha m...@umail.ucsb.edu wrote: Never mind Jeff

HBase MR: run more map tasks than regions

2010-09-14 Thread Alex Baranau

) of map tasks in this situation? Is enhancing TableInputFormat the only way for me? Thank you, Alex Baranau --- http://sematext.com

Re: Client access

2010-09-07 Thread Alex Baranau

for aggregating log data streamed in real time from a large number of servers). (see http://blog.sematext.com/2010/08/02/hadoop-digest-july-2010/ with better formatting and links ;)) Alex Baranau Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - Hadoop - HBase Hadoop ecosystem search

Re: Research projects with Hadoop

2010-09-07 Thread Alex Baranau

Hi Luan, That's not a new question on these mailing lists, so I'd suggest digging into the links at http://search-hadoop.com/?q=research+project+ideaspage. Hadoop-related projects are relatively young and full of ideas, good luck with finding your spot! Alex Baranau Sematext :: http

Re: Research projects with Hadoop

2010-09-07 Thread Alex Baranau

Sorry, looks like the link I provided got corrupted, the original was: http://search-hadoop.com/?q=research+project+ideas Alex Baranau Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - Hadoop - HBase Hadoop ecosystem search :: http://search-hadoop.com/ On Tue, Sep 7, 2010 at 10

Re: Classpath

2010-09-01 Thread Alex Baranau

From http://blog.sematext.com/2010/05/31/hadoop-digest-may-2010/ FAQ section: How can I attach external libraries (jars) which my jobs depend on? You can put them in a “lib” subdirectory of your jar root directory. Alternatively, you can use the DistributedCache API. Alex Baranau Sematext

Re: missing part folder - how to debug?

2010-09-01 Thread Alex Baranau

Hi, Adding the Solr user list. We used a similar approach to the one in this patch, but with Hadoop Streaming. Did you determine that indices are really missing? I mean, did you find missing documents in the output indices? Alex Baranau Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch

Searching more ZooKeeper content

2010-08-25 Thread Alex Baranau

by default? We are looking into adding this search service for all Hadoop's sub-projects. Assuming people are for this, any suggestions for how the search should function by default or any specific instructions for how the search box should be modified would be great! Thank you, Alex Baranau. P.S. HBase

Searching more MapReduce content

2010-08-25 Thread Alex Baranau

by default? We are looking into adding this search service for all Hadoop's sub-projects. Assuming people are for this, any suggestions for how the search should function by default or any specific instructions for how the search box should be modified would be great! Thank you, Alex Baranau. P.S. HBase

Searching more Hadoop-Common content

2010-08-25 Thread Alex Baranau

project by default? We are looking into adding this search service for all Hadoop's sub-projects. Assuming people are for this, any suggestions for how the search should function by default or any specific instructions for how the search box should be modified would be great! Thank you, Alex Baranau. P.S

Searching more HDFS content

2010-08-25 Thread Alex Baranau

? We are looking into adding this search service for all Hadoop's sub-projects. Assuming people are for this, any suggestions for how the search should function by default or any specific instructions for how the search box should be modified would be great! Thank you, Alex Baranau. P.S. HBase community

Re: FAQ for New to Hadoop

2010-07-11 Thread Alex Baranau

posts on http://blog.sematext.com as well. Alex Baranau Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - Hadoop - HBase Hadoop ecosystem search :: http://search-hadoop.com/ On Fri, Jul 9, 2010 at 1:35 AM, Mark Kerzner markkerz...@gmail.com wrote: Cool, Ken, thank you, I think


25 matches
