Re: 14.1 to 14.2

2007-10-12 Thread Michael G. Noll
When I tried to upgrade to 0.14.2, I ran into a different error (see snippet below). Trying to use the new Hadoop version with Java 1.5.x fails for me, so I had to switch to Java 1.6.x. I was quite surprised because IMHO this is a rather big new requirement - if it's intended - so I had expected

How to deploy hadoop webdav?

2007-10-12 Thread 贺齐
Hi, I am quite a newbie to WebDAV. Could you give me an example of how to deploy Hadoop WebDAV? Thank you! Regards

Re: HBase performance

2007-10-12 Thread Jeff Hammerbacher
hmm, i'm going to have to disagree strongly with jim here on several points: 1) the paper you reference has nothing to do with column-store performance: it's all about a new, in-memory oltp system being worked on in stonebraker's lab/vertica. it's mainly about removing disk access via

Re: jdk6 on darwin

2007-10-12 Thread Doug Cutting
Michael Bieniosek wrote: Does anybody know if there is a jdk6 available for Mac? I checked the apple developer site, and there doesn't seem to be one available, despite blogs from last year claiming apple was distributing it. Since I do my development work on a Mac, switching to jdk6 would

Re: jdk6 on darwin

2007-10-12 Thread Colin Evans
Apple used to have a beta 1.6 JDK available, but it looks like it was pulled from their developer site recently. I tried using the beta a while back, and found that some apps wouldn't work with it, so it might not be a good solution anyways. I'm a bit confused by this discussion though.

Re: jdk6 on darwin (was: 14.1 to 14.2)

2007-10-12 Thread Bob Futrelle
Wait a few weeks. jdk6 (jse 6) should be in the Leopard Mac OS X release, c. 26 Oct. - rpf On 10/12/07, Michael Bieniosek [EMAIL PROTECTED] wrote: Does anybody know if there is a jdk6 available for Mac? I checked the apple developer site, and there doesn't seem to be one available, despite

Re: How to get input path in map method

2007-10-12 Thread Ted Dunning
It is also pretty easy to over-ride bits of TextInputFormat to give the file as the key instead of the offset. On 10/12/07 10:19 AM, Benjamin Reed [EMAIL PROTECTED] wrote: We do this in Pig by using our own InputSplits. ben On Friday 12 October 2007, Owen O'Malley wrote: On Oct 12,

Re: jdk6 on darwin

2007-10-12 Thread Doug Cutting
Colin Evans wrote: I'm a bit confused by this discussion though. How would compiling the jars with Java 1.5 and running on 1.6 degrade performance (assuming that the jars don't use any new 1.6 APIs)? It won't. The claim is just that running with Java 1.5 degrades performance significantly.

Re: jdk6 on darwin (was: 14.1 to 14.2)

2007-10-12 Thread Torsten Curdt
On 12.10.2007, at 19:34, Michael Bieniosek wrote: Does anybody know if there is a jdk6 available for Mac? I checked the apple developer site, and there doesn't seem to be one available, despite blogs from last year claiming apple was distributing it. Since I do my development work on a

Re: HBase performance

2007-10-12 Thread Jonathan Hendler
One of the valid points Stonebraker makes, I think, has to do with compression (and null values). For example - does HBase also offer tools, or a strategy for compression? Maybe it's comparing apples to [whatever]. Since Vertica is also a distributed database, I think it may be interesting to

Re: HBase performance

2007-10-12 Thread Doug Cutting
Jonathan Hendler wrote: Since Vertica is also a distributed database, I think it may be interesting to the newbies like myself on the list. To keep the conversation topical - while it's true there's a major campaign of PR around Vertica, I'd be interested in hearing more about how HBase

jdk6 on darwin (was: 14.1 to 14.2)

2007-10-12 Thread Michael Bieniosek
Does anybody know if there is a jdk6 available for Mac? I checked the apple developer site, and there doesn't seem to be one available, despite blogs from last year claiming apple was distributing it. Since I do my development work on a Mac, switching to jdk6 would be very difficult for me if

Re: coding question: user's global variables

2007-10-12 Thread Benjamin Reed
You could put the variables in ZooKeeper and then they would be shared :) ben On Friday 12 October 2007, Owen O'Malley wrote: On Oct 11, 2007, at 9:54 PM, James Yu wrote: I put all user global variables in a class I called MyGlobals. Since map/reduce is distributed in general, you should be

How to get input path in map method

2007-10-12 Thread Shailendra Mudgal
Hi All, I am adding two input dirs to a job. Both input dirs have the same Key.class and Value.class. Inside the map method I want to know which <key, value> pair has come from which input dir. How can I do this? Any help will be appreciated. Regards, Shaile

Re: coding question: user's global variables

2007-10-12 Thread Dennis Kubes
You can also use a MapRunnable implementation but that would allow global only to each Map task. Dennis Kubes James Yu wrote: For example: I put all user global variables in a class I called MyGlobals public class MyGlobals { static public int var1; ... } Then, in whatever map

Re: I have a new cluster (Xserve + 4 Mac Minis) How to Hadoop?

2007-10-12 Thread Ross Boucher
I can tell you from experience that Hadoop does run fine under OS X, and would second the recommendation not to bother with Linux. The process of setting up the cluster is also just as simple under OS X. Partitioning the internal drives may not be a bad idea, so you can keep the DFS data

RE: HBase performance

2007-10-12 Thread Jim Kellerman
Stonebraker has a new column oriented store called H-Store. It is also talked about in the paper. And now I'll shut up. I didn't intend to create such a firestorm. --- Jim Kellerman, Senior Engineer; Powerset [EMAIL PROTECTED] -Original Message- From: Doug Cutting [mailto:[EMAIL

Re: How to get input path in map method

2007-10-12 Thread Owen O'Malley
On Oct 12, 2007, at 5:51 AM, Shailendra Mudgal wrote: I am adding two input dirs to a job. Both input dirs have the same Key.class and Value.class. Inside the map method I want to know which <key, value> pair has come from which input dir. How can I do this? Any help will be appreciated.
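As a rough illustration of the question above, here is a minimal sketch in plain Java with no Hadoop dependency. It assumes the map task can already obtain the path of the file it is currently reading (in Hadoop releases of this period that was commonly read from the `map.input.file` job property, which should be verified against your version). The class name `InputSource`, the method `sourceDir`, and the example paths are hypothetical and not part of any Hadoop API.

```java
import java.util.Arrays;
import java.util.List;

public class InputSource {
    /**
     * Returns the configured input dir that contains inputFile,
     * or null if the file is under none of them. When input dirs
     * are nested, the longest (most specific) match wins.
     */
    public static String sourceDir(String inputFile, List<String> inputDirs) {
        String best = null;
        for (String dir : inputDirs) {
            // Compare against "dir/" so that /data/logs does not match /data/logs2.
            String prefix = dir.endsWith("/") ? dir : dir + "/";
            if (inputFile.startsWith(prefix)
                    && (best == null || dir.length() > best.length())) {
                best = dir;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical input dirs and file path, for illustration only.
        List<String> dirs = Arrays.asList("/data/logs", "/data/users");
        System.out.println(sourceDir("/data/users/part-00000", dirs)); // prints /data/users
    }
}
```

A mapper could call such a helper once per task (for example when it is configured) and keep the result in a field, since each map task normally reads from a single input file.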

Re: jdk6 on darwin

2007-10-12 Thread Torsten Curdt
On 12.10.2007, at 20:10, Doug Cutting wrote: Michael Bieniosek wrote: Does anybody know if there is a jdk6 available for Mac? I checked the apple developer site, and there doesn't seem to be one available, despite blogs from last year claiming apple was distributing it. Since I do my

File Paths, Hadoop >= 0.15 and Local Jobs

2007-10-12 Thread Dennis Kubes
Just in case this can help somebody else and because I just spent a couple of hours debugging this, thought I would share an insight. This only affects locally running jobs, not the DFS, and should only affect Windows users. On Windows with Hadoop 0.14 and below, you used to be able to do

RE: HBase performance

2007-10-12 Thread Jim Kellerman
One more comment and then I'll really shut up, I promise. On re-reading the paper, you are all absolutely correct about C-Store, H-Store and Vertica. What is not in the paper and part of what he presented this week was applying column oriented stores to the TPC-H benchmark. The TPC-H OLTP

Re: HBase performance

2007-10-12 Thread Peter W.
Hi, I've had some limited experience with Oracle, SQL Server, Informix and at least one commercial in-memory database. More recently, I use mysql memory tables for fun speeding up bulk read-write operations such as: set max_heap_table_size=250*1024*1024; create table mem_proptbl (field_one

Re: coding question: user's global variables

2007-10-12 Thread Peter W.
James, I think you can put those variables inside the mapper or reducer without creating a separate public class. untested code follows... public static class R extends MapReduceBase implements Reducer { private static Set s=new HashSet(); public void reduce(WritableComparable

Multiple users using different classpaths per job

2007-10-12 Thread Xavier Stevens
We are using hadoop for multiple users and the DFS is using a shared directory for data as noted by FAQ #13. Is there a way to have hadoop use a different classpath per job? Currently if I startup the hadoop instance with no script modifications, and then run a job bin/hadoop classname

Hadoop UML ?

2007-10-12 Thread James Yu
Is there Hadoop UML class and sequence diagram available? Thanks, James Yu

Re: jdk6 on darwin

2007-10-12 Thread Peter W.
Hello, Does the jdk6 developer preview install in: /System/Library/Frameworks/JavaVM.framework/Versions? Is your CurrentJDK symbolic link changed? Will 12.3 Hadoop code still run? Thanks, Peter W.

Re: Hadoop UML ?

2007-10-12 Thread Dmitry
James, Guess you can create them on the fly using Rational Rose or any other CASE tool. Did not see them so far in the Hadoop repository. It would be nice to have them anyway. Thanks, DT www.ejinz.com - Original Message - From: James Yu [EMAIL PROTECTED] To: hadoop-user@lucene.apache.org

RE: Question about valueaggregators in 0.14.1...

2007-10-12 Thread C G
Hi Runping and All: That fixed the problem. Of course my aggregator is now failing for a different reason, but that's an error in my code that I can fix. I am extremely grateful for your assistance! Thanks, C G Runping Qi [EMAIL PROTECTED] wrote: I am sorry I overlooked

RE: Question about valueaggregators in 0.14.1...

2007-10-12 Thread Runping Qi
Glad it works. HADOOP-1622 should fix the problem properly. Until then, the users have to use this kind of hacky obscure workaround:) Runping -Original Message- From: C G [mailto:[EMAIL PROTECTED] Sent: Friday, October 12, 2007 8:48 PM To: hadoop-user@lucene.apache.org Subject:

Re: HBase performance

2007-10-12 Thread Jason Watkins
- writes: a row oriented database writes the whole row regardless of whether values are supplied for every field. Space is reserved for null fields, so the number of bytes written is the same for every row. In a column oriented database, only the columns for which values
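The write-size argument in this message can be sketched as a toy byte count. The following is a simplified model, not HBase or Vertica code: it assumes every field occupies a fixed 8 bytes and ignores any per-value overhead (row keys, column identifiers, timestamps) that a real column store would add. The class and method names are hypothetical.

```java
public class StorageSketch {
    // Assumed fixed width of one field, in bytes (illustrative only).
    static final int FIELD_WIDTH = 8;

    /** Row store model: every field slot is written, null or not. */
    public static int rowStoreBytes(Long[] row) {
        return row.length * FIELD_WIDTH;
    }

    /** Column store model: only fields that actually hold a value are written. */
    public static int columnStoreBytes(Long[] row) {
        int bytes = 0;
        for (Long value : row) {
            if (value != null) {
                bytes += FIELD_WIDTH;
            }
        }
        return bytes;
    }

    public static void main(String[] args) {
        Long[] sparseRow = {1L, null, null, 4L};
        System.out.println(rowStoreBytes(sparseRow));    // prints 32
        System.out.println(columnStoreBytes(sparseRow)); // prints 16
    }
}
```

Under this model the gap grows with the share of null fields. For a fully populated row the two counts are equal, and once real per-value overhead is included the column layout can cost more.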