Re: changes to compression interfaces in 0.15?

2008-02-21 Thread Pete Wyckoff
If the API semantics are changing under you, you have to change your code whether or not the API is pulled or deprecated. Pulling it makes it more obvious that the user has to change his/her code. -- pete On 2/21/08 12:41 PM, Arun C Murthy [EMAIL PROTECTED] wrote: On Feb 21, 2008, at

Re: How to compile fuse-dfs

2008-03-10 Thread Pete Wyckoff
Hi Xavier, If you run ./bootstrap.sh does it not create a Makefile for you? There is a bug in the Makefile that hardcodes it to amd64. I will look at this. What kernel are you using and what HW? --pete On 3/10/08 2:23 PM, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: Hi everybody, I'm

Re: Does Hadoop Honor Reserved Space?

2008-03-10 Thread Pete Wyckoff
https://issues.apache.org/jira/browse/HADOOP-2991 -Original Message- From: Joydeep Sen Sarma [mailto:[EMAIL PROTECTED] Sent: Monday, March 10, 2008 12:56 PM To: core-user@hadoop.apache.org; core-user@hadoop.apache.org Cc: Pete Wyckoff Subject: RE: Does Hadoop Honor Reserved Space

Re: HDFS interface

2008-03-11 Thread Pete Wyckoff
Hi Naama, This JIRA is tracking both the fuse and webdav efforts: https://issues.apache.org/jira/browse/HADOOP-4 -- pete On 3/10/08 11:17 PM, Naama Kraus [EMAIL PROTECTED] wrote: Hi, I'd be interested in information about interfaces to HDFS other than the DFSShell commands. I've seen

Re: How to compile fuse-dfs

2008-03-11 Thread Pete Wyckoff
But, to be clear, you can do mv, rm, mkdir, rmdir. On 3/11/08 10:24 AM, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: Thanks Pete. I'll be waiting for 0.17 then

Experience with Hadoop on Open Solaris

2008-03-28 Thread Pete Wyckoff
Anyone have experience running a production cluster on Open Solaris? The advantage of course is the availability of ZFS, but I haven't seen much in the way of people on the list mentioning they use Open Solaris. Thanks, pete

Re: 答复: Problem with key aggregation when number of reduce tasks is more than 1

2008-04-11 Thread Pete Wyckoff
Yes and as such, we've found better load balancing when the #of reduces is a prime #. Although the string.hashCode isn't great for short strings. On 4/11/08 4:16 AM, Zhang, jian [EMAIL PROTECTED] wrote: Hi, Please read this, you need to implement partitioner. It controls which key is
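A minimal sketch of the load-balancing point above, not taken from the thread. It assumes keys are assigned to reduces by masking the sign bit of hashCode() and taking the remainder modulo the number of reduces; the class name, method names, and sample keys are hypothetical and chosen so that every hash code is even.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch: how the number of reduces interacts with key hash codes. */
final class ReducePartitionSketch {

    /** Mask off the sign bit so the remainder is never negative. */
    static int partitionFor(String key, int numReduces) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduces;
    }

    /** Count how many keys land on each reduce. */
    static int[] histogram(List<String> keys, int numReduces) {
        int[] counts = new int[numReduces];
        for (String key : keys) {
            counts[partitionFor(key, numReduces)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical keys: single characters whose hash codes are all even.
        List<String> keys = Arrays.asList("b", "d", "f", "h", "j", "l");
        // With 4 reduces, the shared factor of 2 leaves two reduces idle.
        System.out.println(Arrays.toString(histogram(keys, 4))); // [3, 0, 3, 0]
        // With 5 reduces (a prime), every reduce receives at least one key.
        System.out.println(Arrays.toString(histogram(keys, 5))); // [1, 1, 1, 2, 1]
    }
}
```

The sample keys are contrived to make the effect visible; real key sets would show it only when their hash codes share a factor with the reduce count.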

Re: Fuse-j-hadoopfs

2008-07-23 Thread Pete Wyckoff
Hi Xavier, RE: fuse dfs having facebook specific things in it, I think that the trunk version should be pretty clean. As far as permissions in fuse dfs, the following 2 jiras relate to that, and Craig Macdonald is working on it. https://issues.apache.org/jira/browse/HADOOP-3765

Re: fuse-dfs

2008-08-06 Thread Pete Wyckoff
Hi Sebastian, The problem is that hdfs.so is supposed to be in build/libhdfs but for some reason isn't. Have you tried doing an ant compile-libhdfs -Dlibhdfs=1? And then checked if hdfs.so is in build/libhdfs? Thanks, pete On 8/6/08 5:04 AM, Sebastian Vieira [EMAIL PROTECTED] wrote: Hi,

Re: fuse-dfs

2008-08-06 Thread Pete Wyckoff
Sorry - I see the problem now: should be: ant compile-contrib -Dlibhdfs=1 -Dfusedfs=1 compile-contrib depends on compile-libhdfs which also requires the -Dlibhdfs=1 property to be set. pete On 8/6/08 5:04 AM, Sebastian Vieira [EMAIL PROTECTED] wrote: Hi, I have installed Hadoop on 20

Re: hdfs question

2008-08-07 Thread Pete Wyckoff
One way to get all Unix commands to work as is is to mount hdfs as a normal unix filesystem with either fuse-dfs (in contrib) or hdfs-fuse (on google code). Pete On 8/6/08 5:08 PM, Mori Bellamy [EMAIL PROTECTED] wrote: hey all, often i find it would be convenient for me to run conventional

Re: fuse-dfs

2008-08-07 Thread Pete Wyckoff
the fuse_dfs_wrapper.sh, you should be able to set HADOOP_HOME and it will create the classpath for you. In retrospect, fuse_dfs_wrapper.sh should probably complain and exit if HADOOP_HOME is not set. -- pete On 8/7/08 2:35 PM, Sebastian Vieira [EMAIL PROTECTED] wrote: On Thu, Aug 7, 2008 at 4:25 PM, Pete

Re: fuse-dfs

2008-08-08 Thread Pete Wyckoff
Hi Sebastian. Setting of times doesn't work, but ls, rm, rmdir, mkdir, cp, etc etc should work. Things that are not currently supported include: Touch, chown, chmod, permissions in general and obviously random writes for which you would get an IO error. This is what I get on 0.17 for df -h:

Re: Thinking about retrieving DFS metadata from datanodes!!!

2008-09-09 Thread Pete Wyckoff
+1 - from the perspective of the data nodes, dfs is just a block-level store and is thus much more robust and scalable. On 9/9/08 9:14 AM, Owen O'Malley [EMAIL PROTECTED] wrote: This isn't a very stable direction. You really don't want multiple distinct methods for storing the metadata,

Re: Thinking about retrieving DFS metadata from datanodes!!!

2008-09-11 Thread Pete Wyckoff
suggestion is appreciate! 2008/9/10 Pete Wyckoff [EMAIL PROTECTED] +1 - from the perspective of the data nodes, dfs is just a block-level store and is thus much more robust and scalable. On 9/9/08 9:14 AM, Owen O'Malley [EMAIL PROTECTED] wrote: This isn't a very stable direction. You

serialization.Deserializer.deserialize method help

2008-09-12 Thread Pete Wyckoff
This method's signature is {code} T deserialize(T); {code} But, the RecordReader next method is {code} boolean next(K,V); {code} So, if the deserialize method does not return the same T (i.e., K or V), how would this new Object be propagated back thru the RecordReader next method. It seems

Re: serialization.Deserializer.deserialize method help

2008-09-12 Thread Pete Wyckoff
Specifically, line 75 of SequenceFileRecordReader: boolean remaining = (in.next(key) != null); Throws out the return value of SequenceFile.next which is the result of deserialize(obj). -- pete On 9/12/08 2:28 PM, Pete Wyckoff [EMAIL PROTECTED] wrote: What I mean is let's say I plug

Re: serialization.Deserializer.deserialize method help

2008-09-12 Thread Pete Wyckoff
Sorry - saw the response after I sent this. But the current javadocs are wrong and should probably say must return what was passed in. On 9/12/08 3:02 PM, Pete Wyckoff [EMAIL PROTECTED] wrote: Specifically, line 75 of SequenceFileRecordReader: boolean remaining = (in.next(key) != null
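A small self-contained sketch of the concern raised in this thread, assuming a reader that only checks the deserialize result for null. The interface below is a stand-in with the signature quoted earlier, not the Hadoop class itself, and Record, InPlace, Fresh, and next are hypothetical names introduced for illustration.

```java
/** Illustrative sketch: why a deserializer should return the object it was handed. */
final class DeserializeReturnSketch {

    /** Stand-in interface with the signature quoted in the thread. */
    interface Deserializer<T> {
        T deserialize(T reuse);
    }

    /** Mutable record that the caller allocates once and reuses. */
    static final class Record {
        String value = "";
    }

    /** Fills in the object it was given and returns that same object. */
    static final class InPlace implements Deserializer<Record> {
        public Record deserialize(Record reuse) {
            reuse.value = "payload";
            return reuse;
        }
    }

    /** Ignores its argument and returns a newly allocated object. */
    static final class Fresh implements Deserializer<Record> {
        public Record deserialize(Record reuse) {
            Record created = new Record();
            created.value = "payload";
            return created;
        }
    }

    /** Mimics a next(key)-style reader: the result is only compared with null. */
    static boolean next(Deserializer<Record> deserializer, Record key) {
        return deserializer.deserialize(key) != null; // returned object is dropped here
    }

    public static void main(String[] args) {
        Record a = new Record();
        next(new InPlace(), a);
        System.out.println("in place: [" + a.value + "]"); // caller sees the data

        Record b = new Record();
        next(new Fresh(), b);
        System.out.println("fresh:    [" + b.value + "]"); // caller's object is still empty
    }
}
```

Under this assumption, a deserializer that allocates a new object loses its data at the reader boundary, which is consistent with the suggestion that the javadocs should require returning what was passed in.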

Parameterized deserializers?

2008-09-12 Thread Pete Wyckoff
If I have a generic Serializer/Deserializers that take some runtime information to instantiate, how would this work in the current serializer/deserializer APIs? And depending on this runtime information, may return different Objects although they may all derive from the same class. For example,

Re: Parameterized deserializers?

2008-09-12 Thread Pete Wyckoff
I should mention this is out of the context of SequenceFiles where we get the class names in the file itself. Here there is some information inserted into the JobConf that tells me the class of the records in the input file. -- pete On 9/12/08 3:26 PM, Pete Wyckoff [EMAIL PROTECTED] wrote

Re: LHadoop Server simple Hadoop input and output

2008-10-24 Thread Pete Wyckoff
Chukwa also could be used here. On 10/24/08 11:47 AM, Jeff Hammerbacher [EMAIL PROTECTED] wrote: Hey Edward, The application we used at Facebook to transmit new data is open source now and available at http://sourceforge.net/projects/scribeserver/. Later, Jeff On Fri, Oct 24, 2008 at 10:14

Re: Status FUSE-Support of HDFS

2008-11-03 Thread Pete Wyckoff
PROTECTED] wrote: Hi Pete, thanks for the info. That helps a lot. We will probably test it for our use cases then. Did you benchmark throughput when reading writing files through fuse-dfs and compared it to command line tool or API access? Is there a notable difference? Thanks again, Robert Pete

Re: Status FUSE-Support of HDFS

2008-11-03 Thread Pete Wyckoff
at facebook. -- pete It's good for a portable application to keep the #of files/directory low by having two levels of directory for storing files -just use a hash operation to determine which dir to store a specific file in. On 11/3/08 4:00 AM, Steve Loughran [EMAIL PROTECTED] wrote: Pete Wyckoff wrote
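The two-level directory idea in this message could be sketched roughly as follows. The fan-out of 256 directories per level, the hex directory names, and the use of String.hashCode are all assumptions made for illustration; the message only says to use a hash operation to pick the directory.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Illustrative sketch: spread files over two levels of hashed directories. */
final class HashedDirLayoutSketch {

    /** Hypothetical fan-out: 256 directories per level. */
    static final int FAN_OUT = 256;

    /** Maps a file name to root/<level1>/<level2>/<fileName>. */
    static Path pathFor(Path root, String fileName) {
        int hash = fileName.hashCode() & Integer.MAX_VALUE; // clear the sign bit
        int level1 = hash % FAN_OUT;                        // low 8 bits
        int level2 = (hash / FAN_OUT) % FAN_OUT;            // next 8 bits
        return root.resolve(String.format("%02x", level1))
                   .resolve(String.format("%02x", level2))
                   .resolve(fileName);
    }

    public static void main(String[] args) {
        // The same name always maps to the same directory, so lookups need no index.
        System.out.println(pathFor(Paths.get("data"), "part-00000"));
    }
}
```

With this fan-out there are 65,536 leaf directories, so even a very large number of files keeps each directory listing short.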

Re: Can anyone recommend me a inter-language data file format?

2008-11-03 Thread Pete Wyckoff
Protocol buffers, thrift? On 11/3/08 4:07 AM, Steve Loughran [EMAIL PROTECTED] wrote: Zhou, Yunqing wrote: embedded database cannot handle large-scale data, not very efficient I have about 1 billion records. these records should be passed through some modules. I mean a data exchange format

Re: Status FUSE-Support of HDFS

2008-11-06 Thread Pete Wyckoff
if it turns out to be the same order of magnitude on our systems. Have you used this with rsync? If so, any known issues with that (reading or writing)? Thanks in advance, Robert Pete Wyckoff wrote: Reads are 20-30% slower Writes are 33% slower before https://issues.apache.org/jira/browse/HADOOP

Re: libhdfs SIGSEGV error

2008-11-06 Thread Pete Wyckoff
Hi Tamas, Have you tried using the supplied hdfs_write executable included in the distribution? Also, I didn't understand your comment about using hdfsJniHelper.c - that should be used only by hdfs.c itself. Also, what version of hadoop is this? I haven't seen this problem at least in

Re: FUSE writes in 0.18.1

2008-11-07 Thread Pete Wyckoff
You know what - writes do not work at all in 18.1 - sorry my confusion. - just switch to 18.2 fuse-dfs. It can run against 18.1 dfs. Just remember to compile it with -Dlibhdfs.noperms=1 On 11/7/08 12:20 PM, Brian Karlak [EMAIL PROTECTED] wrote: On Nov 7, 2008, at 12:09 PM, Pete Wyckoff

Re: libhdfs SIGSEGV error

2008-11-07 Thread Pete Wyckoff
it fails? Thank you. Cheers, Tamas --- On Thu, 11/6/08, Pete Wyckoff [EMAIL PROTECTED] wrote: From: Pete Wyckoff [EMAIL PROTECTED] Subject: Re: libhdfs SIGSEGV error To: , [EMAIL PROTECTED]@yahoo.com, core-user@hadoop.apache.org core-user@hadoop.apache.org Date: Thursday, November 6, 2008, 7:20 PM

Can FSDataInputStream.read return 0 bytes and if so, what does that mean?

2008-11-07 Thread Pete Wyckoff
The javadocs say reads up to size bytes. What happens if it returns < 0 (presumably an error) or 0 bytes (??) Thanks, pete

Re: Can FSDataInputStream.read return 0 bytes and if so, what does that mean?

2008-11-07 Thread Pete Wyckoff
Just want to ensure 0 iff EOF or the requested #of bytes was 0. On 11/7/08 6:13 PM, Pete Wyckoff [EMAIL PROTECTED] wrote: The javadocs say reads up to size bytes. What happens if it returns < 0 (presumably an error) or 0 bytes (??) Thanks, pete
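For comparison, here is a sketch of a read loop written against the general java.io.InputStream contract, under which end of stream is signalled by -1 and a return of 0 happens only when the requested length is 0. Whether FSDataInputStream in this release follows that contract exactly is the open question of the thread and is not confirmed here; a ByteArrayInputStream stands in for the HDFS stream, and the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Illustrative sketch: draining a stream while tolerating short reads. */
final class ReadLoopSketch {

    /** Reads until end of stream and returns the total number of bytes seen. */
    static long drain(InputStream in, int bufferSize) {
        if (bufferSize <= 0) {
            // A zero-length request returns 0 under the contract, which would never end the loop.
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        try {
            int n;
            // -1 marks end of stream; any other value may be smaller than the buffer.
            while ((n = in.read(buffer, 0, buffer.length)) != -1) {
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        InputStream in = new ByteArrayInputStream(new byte[10]);
        System.out.println(drain(in, 4)); // 10
    }
}
```

A loop written this way does not depend on how a 0 return is interpreted, because it only terminates on -1.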

Re: hdfs fuse mount and namenode out of memory conditions.

2008-11-17 Thread Pete Wyckoff
Hi Jason, There's nothing that fuse does that should cause this I don't think. If this is a cat operation, fuse will do this in chunks of 32 K which you can change by mounting with -ordbuffer_size=#bytes. Do you have the stack trace or the NN log while this app is running? Were you doing the

Re: Block placement in HDFS

2008-11-25 Thread Pete Wyckoff
Fyi - Owen is referring to: https://issues.apache.org/jira/browse/HADOOP-2559 On 11/24/08 10:22 PM, Owen O'Malley [EMAIL PROTECTED] wrote: On Nov 24, 2008, at 8:44 PM, Mahadev Konar wrote: Hi Dennis, I don't think that is possible to do. No, it is not possible. The block placement

For those using Hadoop in the social network domain

2008-11-30 Thread Pete Wyckoff
SOCIAL NETWORK SYSTEMS 2009 (SNS-2009) Second ACM Workshop on Social Network Systems March 31, EuroSys 2009 Nuremberg, Germany