Re: Streaming data - Available tools

2014-07-04 Thread Marcos Ortiz
-- Marcos Ortiz <http://www.linkedin.com/in/mlortiz

Re: Storing videos in Hdfs

2014-06-17 Thread Marcos Ortiz
andra for Real-Time video analytics. -- Marcos Ortiz (@marcosluis2186) http://about.me/marcosortiz On Tuesday, June 17, 2014 06:12:49 PM alajangikish...@gmail.com wrote: > Hi hadoopers, > > What is the best way to store video files in Hdfs? > > Sent from my iPhone

Re: MapReduce scalability study

2014-05-22 Thread Marcos Ortiz
/MAPREDUCE-278 https://issues.apache.org/jira/browse/MAPREDUCE-279 You should talk with Arun C Murthy, Chief Architect at Hortonworks, about all these topics. He could help you much more than I could. -- Marcos Ortiz (@marcosluis2186) http://about.me/marcosortiz > > > Best

Re: Job Tracker Stops as Task Tracker starts

2014-05-20 Thread Marcos Ortiz
Which version of the JDK are you using on your servers? Which version of Hadoop are you using? -- Marcos Ortiz (@marcosluis2186) http://about.me/marcosortiz On Tuesday, May 20, 2014 09:01:07 PM Faisal Rabbani wrote: > Hi, > I just installed jobtracker and task trackers but as soon as I

Re: Which database should be used

2014-05-02 Thread Marcos Ortiz
On Friday, May 02, 2014 04:21:58 PM Alex Lee wrote: > There are many databases, such as HBase, Hive and mango etc. I need to choose > one to save big-volume stream data from sensors. > Will HBase be good? Thanks. HBase could be a good fit for this case. You should check the OpenTSDB project

Re: Random Exception

2014-05-02 Thread Marcos Ortiz
It seems that your Hadoop data directory is broken or your disk has problems. Which version of Hadoop are you using? On Friday, May 02, 2014 08:43:44 AM S.L wrote: > Hi All, > > I get this exception after I resubmit my failed MapReduce job, can one > please let me know what this exception means

Re: upgrade to CDH5 from CDH4.6 hadoop 2.0

2014-04-28 Thread Marcos Ortiz
Regards, Motty. I think this kind of question should be asked on the CDH Users mailing list. There you will get a better and faster answer. Best wishes On Monday, April 28, 2014 01:00:13 PM motty cruz wrote: > Hello, I'm upgrading to CDH5. I download latest parcel from > http://archiv

Re: Intel Hadoop Distribution.

2013-03-01 Thread Marcos Ortiz
jw-02-2013/130227-intel-releases-hadoop-distribution.html I was wondering how their distribution is different from the other players', or why anyone would buy Intel's distribution at all? (If this is not suited for this mailing list, please let me know.) Thanks -- Marcos Ortiz Valmased

Re: Hadoop 2.0.3 namenode issue

2013-02-15 Thread Marcos Ortiz Valmaseda
5:03,391 WARN ipc.Server - Incorrect header or version mismatch from 10.232.29.4:40031 got version 4 expected version 7 2013-02-13 12:16:33,181 INFO namenode.FSNamesystem - Roll Edit Log from 10.232.29.14 == ====== -- Marcos Ortiz Valmaseda, Product Ma

Re: .deflate trouble

2013-02-15 Thread Marcos Ortiz Valmaseda
p installation instead of an EMR-based installation. But I might contact them anyway to see what they recommend. Thanks for the refs. On Feb 14, 2013, at 19:09 , Marcos Ortiz Valmaseda wrote: > Regards, Keith. For EMR issues and the like, you can directly contact Jeff > Barr (Chief Evangelist fo

Re: .deflate trouble

2013-02-14 Thread Marcos Ortiz Valmaseda
to suspect that my own is also." -- Mark Twain -- Marcos Ortiz Valmaseda, Product Manager && Data Scientist at UCI Blog: http://marcosluis2186.posterous.com LinkedIn: http://www.linkedin.com/in/marcosluis2186 Twitter: @marcosluis2186

Re: Multiple dfs.data.dir vs RAID0

2013-02-10 Thread Marcos Ortiz
erience/advice/results to share? Thanks, JM -- Marcos Ortiz Valmaseda, Product Manager && Data Scientist at UCI Blog: http://marcosluis2186.posterous.com Twitter: @marcosluis2186 <http://twitter.com/marcosluis2186>

Re: Hadoop-Yarn-MR reading InputSplits and processing them by the RecordReader, architecture/design question.

2013-02-04 Thread Marcos Ortiz
In FileSystem.getFileBlockLocations() the hostname is hard-coded as "localhost"; where is this mapped to the actual host name, so that the AM will know which nodes to request? Thanks for the reply -- +Vinod Hortonworks Inc. http://hortonworks.

Re: Hadoop Tutorial help

2012-12-09 Thread Marcos Ortiz Valmaseda
Hi, Jennifer. As it happens, Robert Evans, from the Yahoo! team, was working on updating this tutorial to use at least the Hadoop 1.x series, but I don't know the current progress of that project. OK, now, you don't need to download Hadoop-0.18.0 because it's included in the VMware Hadoop VM. You can

Re: Strange machine behavior

2012-12-08 Thread Marcos Ortiz
Are you sure that 24 map slots is a good number for this machine? Remember that you have three services (DN, TT and HRegionServer) with 12 GB for heap. Try to use a lower number of map slots (12, for example) and launch your MR job again. Can you share your logs in pastebin? On Sat 08 Dec
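The advice above (fewer map slots when the daemons already hold a large heap) can be sanity-checked with simple arithmetic. The following is only a rough sketch: the function name, the OS reserve, and every number used in the example calls are hypothetical, since the thread does not state the machine's total RAM or the per-task heap.

```python
def max_task_slots(total_ram_gb, daemon_heap_gb, task_heap_gb, os_reserve_gb=2.0):
    """Estimate how many task slots fit in memory on one node.

    All inputs are illustrative assumptions, not values from the thread:
    total_ram_gb   -- physical RAM of the node
    daemon_heap_gb -- combined heap of the long-running daemons (DN, TT, HRegionServer)
    task_heap_gb   -- heap given to each child task JVM
    os_reserve_gb  -- memory left for the operating system and page cache
    """
    if task_heap_gb <= 0:
        raise ValueError("task_heap_gb must be positive")
    free_gb = total_ram_gb - daemon_heap_gb - os_reserve_gb
    if free_gb <= 0:
        return 0
    # Each slot needs a full task heap, so round down.
    return int(free_gb // task_heap_gb)


# Hypothetical node: 24 GB RAM, 12 GB of daemon heap, 1 GB per task.
slots = max_task_slots(24, 12, 1)
print(slots)
```

Under these assumed numbers the node has room for far fewer than 24 slots, which is the kind of mismatch the reply warns about; real sizing also has to account for reduce slots and non-heap JVM memory.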

Re: Reg: No space left on device Exception

2012-12-06 Thread Marcos Ortiz
It seems that you don't have enough space on your TaskTracker/DataNode node. Did you check the available space on the hard drives dedicated to hosting data on your TT/DN machine? On Fri 07 Dec 2012 12:38:27 AM CST, Manoj Babu wrote: Hi All, I am getting the exception as below but the job conti

Re: Hadoop V/S Cassandra

2012-12-06 Thread Marcos Ortiz
On 12/06/2012 11:55 AM, yogesh dhari wrote: Hi all, Hadoop has its own file system (HDFS) and Cassandra has a different file system (CFS). As Hadoop has a great ecosystem (Hive {data warehouse}, HBase {database} etc.) and Cassandra (database) itself provides its own file system Althou

Re: HADOOP UPGRADE ERROR

2012-11-22 Thread Marcos Ortiz
On 11/22/2012 08:55 PM, yogesh dhari wrote: Hi All, I am trying to upgrade hadoop-0.20.2 to hadoop-1.0.4. I used the command *hadoop namenode -upgrade*; after that, if I start the cluster with the command *Start-all.sh the TT and DN don't start.* Which steps did you follow to perform the upgrade proces

Re: hadoop - running examples

2012-11-08 Thread Marcos Ortiz Valmaseda
Mohammad is right. Once you write a file to HDFS, it can't be modified. The pattern in HDFS is write-once/read-many-times. If you want to use a distribution where you can read and write files, you should take a look at the MapR distribution. - Original message - From: Mohammad Tariq To: user

Re: monitoring CPU cores (resource consumption) in hadoop

2012-11-03 Thread Marcos Ortiz
Regards, Jim. In the open source world I don't know. In the enterprise world, Boundary is a great choice. Look here: http://boundary.com/why-boundary/product/ On 11/03/2012 02:59 PM, ugiwgh wrote: Paramon can resolve this problem. It can monitor CPU cores. --GHui -- Origi

Re: Set the number of maps

2012-11-01 Thread Marcos Ortiz
Since 0.21 the option has been renamed to mapreduce.tasktracker.map.tasks.maximum and, as Harsh told you, it is a TaskTracker service-level option. Another thing is that this option is closely tied to mapreduce.child.java.opts, so make sure to constantly monitor the effect of these change

Re: Insight on why distcp becomes slower when adding nodemanager

2012-10-31 Thread Marcos Ortiz
On 10/31/2012 02:23 PM, Michael Segel wrote: Not sure. Lots of things can affect your throughput. Networking is my first guess. Which is why I asked about the number of times you've run the same test to see if there is a wide variation in timings. On Oct 31, 2012, at 7:37 AM, Alexandre Fouc

Re: File Permissions on s3 FileSystem

2012-10-23 Thread Marcos Ortiz
On 23/10/12 13:32, Parth Savani wrote: Hello Everyone, I am trying to run a hadoop job with s3n as my filesystem. I changed the following properties in my hdfs-site.xml fs.default.name =s3n://KEY:VALUE@bucket/ A good practice for this is to use these two propert

Re: Java heap space error

2012-10-21 Thread Marcos Ortiz Valmaseda
Regards, Subash. Can you share more information about your YARN cluster? - Original message - From: Subash D'Souza To: user@hadoop.apache.org Sent: Sun, 21 Oct 2012 09:18:43 -0400 (CDT) Subject: Java heap space error I'm running CDH 4 on a 4 node cluster each with 96 G of RAM. Up unti

Re: hadoop 0.23.3 configurations

2012-10-11 Thread Marcos Ortiz
the configurations are entirely different. Does anyone know how to configure java_home? hadoop-env.sh and mapred-site.xml are also not present in the etc/hadoop/ folder -- Marcos Ortiz Valmaseda, http://about.me/marcosortiz Twitter: @marcosluis2186

Re: issue with permissions of mapred.system.dir

2012-10-09 Thread Marcos Ortiz
On 10/09/2012 07:44 PM, Goldstone, Robin J. wrote: I am bringing up a Hadoop cluster for the first time (but am an experienced sysadmin with lots of cluster experience) and running into an issue with permissions on mapred.system.dir. It has generally been a chore to figure out all the various

Re: sqoop jobs

2012-10-05 Thread Marcos Ortiz
Which version of Sqoop are you using? Which version of Hadoop? On 10/05/2012 09:45 AM, Kartashov, Andy wrote: Guys, Has anyone successfully executed commands like Sqoop job --list Sqoop job --create .. etc. Do I need to set up my sqoop-core.xml beforehand? Example.. sqoop job --list

Re: Hadoop Archives under 0.23

2012-10-02 Thread Marcos Ortiz
-- Marcos Ortiz Valmaseda, Data Engineer && Senior System Administrator at UCI

Re: How to run multiple jobs at the same time?

2012-09-23 Thread Marcos Ortiz
Apache Mahout was built for that. Look here: https://cwiki.apache.org/confluence/display/MAHOUT/K-Means+Clustering If you don't want to use Mahout's approach (highly recommended), you can use the MultipleInput class for that: http://hadoop.apache.org/common/docs/current/api/org/apache/hado

Re: Suggestions required for learning Hadoop

2012-09-13 Thread Marcos Ortiz
Regards, Munnavar. There are some great Refcardz from DZone, written by Eugene Ciurana (http://eugeneciurana.com), which are perfect for sysadmins interested in Hadoop, called: - "Getting Started with Hadoop" - "Deploying Hadoop" http://refcardz.dzone.com If you want to know more, there are a lot

Re: Exception while running a Hadoop example on a standalone install on Windows 7

2012-09-04 Thread Marcos Ortiz
On 09/04/2012 02:35 PM, Udayini Pendyala wrote: Hi Bejoy, Thanks for your response. I first started to install on Ubuntu Linux and ran into a bunch of problems. So, I wanted to back off a bit and try something simple first. Hence, my attempt to install on my Windows 7 Laptop. Well, if you

Re: Hadoop and MainFrame integration

2012-08-28 Thread Marcos Ortiz
The problem is that Hadoop relies on HDFS to store data in blocks of 64/128 MB (or whatever size you configure; 64 MB is the de facto default), and then performs the computations on those blocks. So, you need to move all your data to an HDFS cluster to use it in MapReduce jobs if you want to mak
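The block-based storage mentioned above is easy to illustrate with arithmetic. This is a minimal sketch, not Hadoop code: the function name is hypothetical, and it only computes how many blocks a file of a given size would occupy for a chosen block size.

```python
MIB = 1024 * 1024


def hdfs_block_count(file_size_bytes, block_size_bytes=64 * MIB):
    """Return how many blocks a file occupies for the given block size.

    The 64 MiB default mirrors the size quoted in the message; the
    function itself is an illustration and not part of any Hadoop API.
    """
    if file_size_bytes < 0:
        raise ValueError("file_size_bytes must be non-negative")
    if block_size_bytes <= 0:
        raise ValueError("block_size_bytes must be positive")
    # Integer ceiling division: a partial final block still counts as a block.
    return (file_size_bytes + block_size_bytes - 1) // block_size_bytes


one_gib = 1024 * MIB
print(hdfs_block_count(one_gib))             # blocks at 64 MiB
print(hdfs_block_count(one_gib, 128 * MIB))  # blocks at 128 MiB
```

For splittable input, MapReduce typically schedules roughly one map task per block, which is why the data has to live in HDFS before the job can be parallelised this way.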

Re: distcp error.

2012-08-28 Thread Marcos Ortiz
Hi, Tao. Does this problem occur only with 2.0.1 or with both versions? Have you tried to use distcp from 1.0.3 to 1.0.3? On 28/08/2012 11:36, Tao wrote: > > Hi, all > > I use distcp copying data from hadoop1.0.3 to hadoop 2.0.1. > > When the file path (or file name) contains Chinese characters, an > e

Re: Hadoop or HBase

2012-08-28 Thread Marcos Ortiz
Regards to everyone on the list. Well, you should ask the folks at Tumblr, who use a combination of MySQL and HBase for their blogging platform. They talked about this topic at the last HBaseCon. Here is the link: http://www.hbasecon.com/sessions/growing-your-inbox-hbase-at-tumblr/ Blake Mathen