Re: Too many open files error, which gets resolved after some time

2009-06-23 Thread Stas Oskin
often enough for your cluster. I would recommend just increasing the limit on the node itself and then waiting for an upgrade to solve this. Brian On Jun 23, 2009, at 3:31 AM, Stas Oskin wrote: Hi. Any idea if calling System.gc() periodically will help reduce the amount of pipes / epolls

Re: Too many open files error, which gets resolved after some time

2009-06-23 Thread Stas Oskin
Hi. In my testing, I typically opened between 20 and 40 concurrent streams. Regards. 2009/6/23 Raghu Angadi rang...@yahoo-inc.com Stas Oskin wrote: Hi. Any idea if calling System.gc() periodically will help reduce the amount of pipes / epolls? Since you have HADOOP-4346, you should

Re: Too many open files error, which gets resolved after some time

2009-06-23 Thread Stas Oskin
Hi. So if I open one stream, it should be 4? 2009/6/23 Raghu Angadi rang...@yahoo-inc.com How many threads do you have? The number of active threads is very important. Normally, #fds = (3 * #threads_blocked_on_io) + #streams. 12 per stream is certainly way off. Raghu. Stas Oskin wrote
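For illustration (not from the thread): by that formula, a client with 10 threads blocked on I/O and 5 open streams would be expected to hold roughly 3 * 10 + 5 = 35 descriptors.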

Re: Too many open files error, which gets resolved after some time

2009-06-22 Thread Stas Oskin
file descriptors as the open file limit. On Sun, Jun 21, 2009 at 12:43 PM, Stas Oskin stas.os...@gmail.com wrote: Hi. Thanks for the advice. So you advise explicitly closing each and every file handle that I receive from HDFS? Regards. 2009/6/21 jason hadoop jason.had
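A minimal sketch of the explicit-close pattern being discussed, using the FileSystem API of that era; the path and read loop are placeholder assumptions, not taken from the thread:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ExplicitClose {
        public static void main(String[] args) throws IOException {
            FileSystem fs = FileSystem.get(new Configuration());
            FSDataInputStream in = fs.open(new Path("/test/data.bin")); // hypothetical path
            try {
                byte[] buf = new byte[4096];
                int n;
                while ((n = in.read(buf)) > 0) {
                    // process n bytes here...
                }
            } finally {
                in.close(); // always release the handle, even on error
            }
        }
    }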

Re: Too many open files error, which gets resolved after some time

2009-06-22 Thread Stas Oskin
Hi. So what would be the recommended approach for the pre-0.20.x series? To ensure each file is used by only one thread, and that it is then safe to close the handle in that thread? Regards. 2009/6/22 Steve Loughran ste...@apache.org Raghu Angadi wrote: Is this before 0.20.0? Assuming you have closed

Re: Too many open files error, which gets resolved after some time

2009-06-22 Thread Stas Oskin
Raghu Angadi rang...@yahoo-inc.com 64k might help in the sense, you might hit GC before you hit the limit. Otherwise, your only options are to use the patch attached to HADOOP-4346 or run System.gc() occasionally. I think it should be committed to 0.18.4 Raghu. Stas Oskin wrote: Hi

Re: Too many open files error, which gets resolved after some time

2009-06-22 Thread Stas Oskin
OK, it seems this issue is already patched in the Hadoop distro I'm using (Cloudera). Any idea if I should still call GC manually/periodically to clean out all the stale pipes / epolls? 2009/6/22 Steve Loughran ste...@apache.org Stas Oskin wrote: Hi. So what would be the recommended approach
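A minimal sketch of the periodic-GC workaround suggested in this thread; the 5-minute interval is an arbitrary assumption, not a recommendation from the list:

    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    public class PeriodicGc {
        public static void main(String[] args) {
            ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
            // Ask the JVM for a collection periodically so stale pipe/epoll
            // descriptors held by unreferenced streams get reclaimed sooner.
            scheduler.scheduleAtFixedRate(new Runnable() {
                public void run() {
                    System.gc();
                }
            }, 5, 5, TimeUnit.MINUTES);
        }
    }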

Re: Too many open files error, which gets resolved after some time

2009-06-21 Thread Stas Oskin
0,6 6135319 pipe I'm using FSDataInputStream and FSDataOutputStream, so this might be related to pipes? So, my questions are: 1) What causes these pipes/epolls to appear? 2) More importantly, how can I prevent their accumulation and growth? Thanks in advance! 2009/6/21 Stas Oskin

Re: InputStream.open() efficiency

2009-05-27 Thread Stas Oskin
Hi. Thanks for the advice. Regards. 2009/5/26 Raghu Angadi rang...@yahoo-inc.com Stas Oskin wrote: Hi. Thanks for the answer. Would keeping handlers open for up to 5 minutes cause any issues? 5 min should not cause any issues.. And the same about writing? Writing is not affected by the couple

Re: Specifying NameNode externally to hadoop-site.xml

2009-05-26 Thread Stas Oskin
Hi. Thanks for the tip. Regards. 2009/5/26 Aaron Kimball aa...@cloudera.com Same way. Configuration conf = new Configuration(); conf.set("fs.default.name", "hdfs://foo"); FileSystem fs = FileSystem.get(conf); - Aaron On Mon, May 25, 2009 at 1:02 PM, Stas Oskin stas.os...@gmail.com wrote
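An expanded form of the snippet above, as a sketch; the hostname and port are placeholders:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;

    public class ConnectToNameNode {
        public static void main(String[] args) throws IOException {
            Configuration conf = new Configuration();
            // Point the client at the NameNode without touching hadoop-site.xml
            conf.set("fs.default.name", "hdfs://namenode.example.com:9000");
            FileSystem fs = FileSystem.get(conf);
            System.out.println("Connected to: " + fs.getUri());
        }
    }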

When directly writing to HDFS, the data is moved only on file close

2009-05-26 Thread Stas Oskin
Hi. I'm trying to continuously write data to HDFS via OutputStream(), and want to be able to read it at the same time from another client. The problem is that after the file is created on HDFS with a size of 0, it stays that way, and only fills up when I close the OutputStream(). Here is a simple
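A minimal sketch of the write path being described; with the pre-append semantics discussed in this thread, readers on another client only see the data once close() completes. The path and chunk size are placeholder assumptions:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ContinuousWriter {
        public static void main(String[] args) throws IOException {
            FileSystem fs = FileSystem.get(new Configuration());
            FSDataOutputStream out = fs.create(new Path("/test/stream.bin")); // hypothetical path
            try {
                for (int i = 0; i < 100; i++) {
                    out.write(new byte[64 * 1024]); // 64 KB of payload per iteration
                    // At this point other clients still see a 0-byte file.
                }
            } finally {
                out.close(); // only now does the length become visible to readers
            }
        }
    }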

InputStream.open() efficiency

2009-05-26 Thread Stas Oskin
Hi. I'm looking to find out how InputStream.open() + skip() compares to keeping a handle to the InputStream() and just seeking to the position. Has anyone compared these approaches, and can advise on their speed? Regards.
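The two approaches being compared, sketched side by side; the path and offset are placeholder assumptions:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class SeekVsReopen {
        public static void main(String[] args) throws IOException {
            FileSystem fs = FileSystem.get(new Configuration());
            Path path = new Path("/test/data.bin"); // hypothetical path
            long offset = 1024L * 1024L;

            // Approach 1: re-open and skip (pays an extra NameNode RPC per open)
            FSDataInputStream a = fs.open(path);
            a.skip(offset); // skip() may skip fewer bytes; fine for a sketch
            a.close();

            // Approach 2: keep one handle and seek to the new position
            FSDataInputStream b = fs.open(path);
            b.seek(offset);
            // ... read, then seek again later without re-opening
            b.close();
        }
    }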

Re: When directly writing to HDFS, the data is moved only on file close

2009-05-26 Thread Stas Oskin
, 2009 at 12:08 PM, Stas Oskin stas.os...@gmail.com wrote: Hi. I'm trying to continuously write data to HDFS via OutputStream(), and want to be able to read it at the same time from another client. Problem is, that after the file is created on HDFS with size of 0, it stays that way

Re: When directly writing to HDFS, the data is moved only on file close

2009-05-26 Thread Stas Oskin
summary on the HBase dev list: http://mail-archives.apache.org/mod_mbox/hadoop-hbase-dev/200905.mbox/%3c7c962aed0905231601g533088ebj4a7a068505ba3...@mail.gmail.com%3e Tom On Tue, May 26, 2009 at 12:08 PM, Stas Oskin stas.os...@gmail.com wrote: Hi. I'm trying to continuously write data

Re: InputStream.open() efficiency

2009-05-26 Thread Stas Oskin
() call. So you would save an RPC to the NameNode. There are a couple of issues that affect apps that keep the handlers open for a very long time (many hours to days), but those will be fixed soon. Raghu. Stas Oskin wrote: Hi. I'm looking to find out how InputStream.open() + skip() compares

Blocks amount is stuck in statistics

2009-05-25 Thread Stas Oskin
Hi. I just erased a large test folder with about 20,000 blocks, and created a new one. I copied about 128 blocks, and fsck reflects this correctly, but the NN statistics still show the old number. It does show the currently used space correctly. Any idea if this is a known issue and has been fixed?

Re: Blocks amount is stuck in statistics

2009-05-25 Thread Stas Oskin
Hi. OK, I was too eager to report :). It got sorted out after some time. Regards. 2009/5/25 Stas Oskin stas.os...@gmail.com Hi. I just erased a large test folder with about 20,000 blocks, and created a new one. I copied about 128 blocks, and fsck reflects this correctly, but the NN

Re: RandomAccessFile with HDFS

2009-05-25 Thread Stas Oskin
FSDataInputStream's seek() method). Writing at an arbitrary offset in an HDFS file is not supported however. Cheers, Tom On Sun, May 24, 2009 at 1:33 PM, Stas Oskin stas.os...@gmail.com wrote: Hi. Any idea if RandomAccessFile is going to be supported in HDFS? Regards.
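A sketch of read-side random access with the existing API (positioned reads); random writes, as noted above, are not supported. The path, offset, and buffer size are placeholder assumptions:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class RandomRead {
        public static void main(String[] args) throws IOException {
            FileSystem fs = FileSystem.get(new Configuration());
            FSDataInputStream in = fs.open(new Path("/test/data.bin")); // hypothetical path
            try {
                byte[] buf = new byte[512];
                // Read 512 bytes starting at byte 4096 without disturbing
                // the stream's current position (positioned read).
                in.read(4096L, buf, 0, buf.length);
            } finally {
                in.close();
            }
        }
    }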

LeaseExpiredException Exception

2009-05-25 Thread Stas Oskin
Hi. I have a process that writes to a file on DFS from time to time, using OutputStream. After some time of writing, I start getting the exception below, and the write fails. The DFSClient retries several times, and then fails. Copying the file from local disk to DFS via CopyLocalFile() works

Re: Specifying NameNode externally to hadoop-site.xml

2009-05-25 Thread Stas Oskin
, 2009 at 3:02 PM, Stas Oskin stas.os...@gmail.com wrote: Hi. I'm looking to move the Hadoop NameNode URL outside the hadoop-site.xml file, so I can set it at run-time. Any idea how to do this? Or perhaps there is another configuration that can be applied to the FileSystem

Re: Specifying NameNode externally to hadoop-site.xml

2009-05-25 Thread Stas Oskin
Hi. And if I don't use jobs but only DFS for now? Regards. 2009/5/25 jason hadoop jason.had...@gmail.com conf.set("fs.default.name", "hdfs://host:port"); where conf is the JobConf object of your job, before you submit it. On Mon, May 25, 2009 at 10:16 AM, Stas Oskin stas.os...@gmail.com wrote

RandomAccessFile with HDFS

2009-05-24 Thread Stas Oskin
Hi. Any idea if RandomAccessFile is going to be supported in HDFS? Regards.

Specifying NameNode externally to hadoop-site.xml

2009-05-24 Thread Stas Oskin
Hi. I'm looking to move the Hadoop NameNode URL outside the hadoop-site.xml file, so I can set it at run-time. Any idea how to do this? Or perhaps there is another configuration that can be applied to the FileSystem object? Regards.

Re: Shutdown in progress exception

2009-05-21 Thread Stas Oskin
After you've performed your application shutdown actions you should call FileSystem's closeAll() method. Ahh, thanks for the tip. Regards.
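A sketch of the ordering described above: do the application's HDFS work inside your own shutdown hook, then call closeAll() last. This only makes sense together with the patch mentioned in this thread that disables Hadoop's own FileSystem shutdown hook; the paths are placeholder assumptions:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ShutdownCopy {
        public static void main(String[] args) {
            Runtime.getRuntime().addShutdownHook(new Thread() {
                public void run() {
                    try {
                        FileSystem fs = FileSystem.get(new Configuration());
                        // Flush the last local file to HDFS before the JVM exits
                        fs.copyFromLocalFile(new Path("/tmp/last.bin"),
                                             new Path("/test/last.bin"));
                        FileSystem.closeAll(); // close cached FileSystems only after our own work
                    } catch (IOException e) {
                        e.printStackTrace();
                    }
                }
            });
        }
    }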

Could only be replicated to 0 nodes, instead of 1

2009-05-21 Thread Stas Oskin
Hi. I'm testing Hadoop in our lab, and started getting the following message when trying to copy a file: Could only be replicated to 0 nodes, instead of 1 I have the following setup: * 3 machines, 2 of them with only 80GB of space, and 1 with 1.5GB * Two clients are copying files all the time

Re: Could only be replicated to 0 nodes, instead of 1

2009-05-21 Thread Stas Oskin
suggesting this as I am also new to Hadoop. Ashish Pareek On Thu, May 21, 2009 at 2:41 PM, Stas Oskin stas.os...@gmail.com wrote: Hi. I'm testing Hadoop in our lab, and started getting the following message when trying to copy a file: Could only be replicated to 0 nodes, instead

Re: Could only be replicated to 0 nodes, instead of 1

2009-05-21 Thread Stas Oskin
Hi. I think you should file a jira on this. Most likely this is what is happening : Will do - this goes to the DFS section, correct? * Two out of 3 DNs cannot take any more blocks. * While picking nodes for a new block, the NN mostly skips the third DN as well since '# active writes' on it is

Re: Could only be replicated to 0 nodes, instead of 1

2009-05-21 Thread Stas Oskin
Hi. If this analysis is right, I would add it can happen even on large clusters! I've seen this error on our cluster when we're very full (97%) and very few nodes have any empty space. This usually happens because we have two very large nodes (10x bigger than the rest of the cluster), and

Re: Could only be replicated to 0 nodes, instead of 1

2009-05-21 Thread Stas Oskin
The real trick has been to make sure the balancer doesn't get stuck -- a Nagios plugin makes sure that the stdout has been printed to in the last hour or so, otherwise it kills the running balancer. Stuck balancers have been an issue in the past. Thanks for the advice.

Re: Could only be replicated to 0 nodes, instead of 1

2009-05-21 Thread Stas Oskin
I think you should file a jira on this. Most likely this is what is happening : Here it is - hope it's ok: https://issues.apache.org/jira/browse/HADOOP-5886

Re: Shutdown in progress exception

2009-05-20 Thread Stas Oskin
Hi. 2009/5/20 Tom White t...@cloudera.com Looks like you are trying to copy a file to HDFS in a shutdown hook. Since you can't control the order in which shutdown hooks run, this won't work. There is a patch to allow Hadoop's FileSystem shutdown hook to be disabled so it doesn't close

Re: Shutdown in progress exception

2009-05-20 Thread Stas Oskin
You should only use this if you plan on manually closing FileSystems yourself from within your own shutdown hook. It's somewhat of an advanced feature, and I wouldn't recommend using this patch unless you fully understand the ramifications of modifying the shutdown sequence. Standard

Shutdown in progress exception

2009-05-17 Thread Stas Oskin
Hi. I have an issue where my application, when shutting down (at ShutdownHook level), is unable to copy files to HDFS. Each copy throws the following exception: java.lang.IllegalStateException: Shutdown in progress at

Could not complete file info?

2009-05-13 Thread Stas Oskin
Hi. I'm getting a very strange message, marked as INFO: 09/05/14 02:16:12 INFO dfs.DFSClient: Could not complete file /test/15334.bin retrying What does it mean, and is there any reason for concern? Thanks.

Re: Namenode failed to start with FSNamesystem initialization failed error

2009-05-06 Thread Stas Oskin
/). Stas Oskin wrote: Well, it definitely caused the SecondaryNameNode to crash, and also seems to have triggered some strange issues today as well. By the way, how is the image file named?

Re: Namenode failed to start with FSNamesystem initialization failed error

2009-05-05 Thread Stas Oskin
but no matching entry in namespace. I also tried to recover from the secondary name node files, but the corruption was too widespread and I had to format. Tamir On Mon, May 4, 2009 at 4:48 PM, Stas Oskin stas.os...@gmail.com wrote: Hi. Same conditions - where the space has run out and the fs

Re: Namenode failed to start with FSNamesystem initialization failed error

2009-05-05 Thread Stas Oskin
to store the corrupt image? Can this be reproduced using the image? Usually you can recover manually from a corrupt or truncated image. But more importantly we want to find how it got into this state. Raghu. Stas Oskin wrote: Hi. This is quite a worrisome issue. Can anyone advise

Re: Namenode failed to start with FSNamesystem initialization failed error

2009-05-05 Thread Stas Oskin
Actually, we discovered an annoying bug in our test app today, which might have moved some of the HDFS files to the cluster, including the metadata files. I presume it could be a possible reason for such behavior? :) 2009/5/5 Stas Oskin stas.os...@gmail.com Hi Raghu. The only lead I have

Re: Namenode failed to start with FSNamesystem initialization failed error

2009-05-05 Thread Stas Oskin
Hi. 2009/5/5 Raghu Angadi rang...@yahoo-inc.com Stas Oskin wrote: Actually, we discovered today an annoying bug in our test-app, which might have moved some of the HDFS files to the cluster, including the metadata files. oops! presumably it could have removed the image file itself. I

Namenode failed to start with FSNamesystem initialization failed error

2009-05-04 Thread Stas Oskin
Hi. After rebooting the NameNode server, I found that the NameNode no longer starts. The logs contained this error: FSNamesystem initialization failed. I suspected filesystem corruption, so I tried to recover from the SecondaryNameNode. The problem is, it was completely empty! I had an issue that

Re: Getting free and used space

2009-05-02 Thread Stas Oskin
edlinuxg...@gmail.com You can also pull these variables from the name node and datanode with JMX. I am doing this to graph them with Cacti. Both the JMX READ/WRITE and READ users can access this variable. On Tue, Apr 28, 2009 at 8:29 AM, Stas Oskin stas.os...@gmail.com wrote: Hi. Any idea

Re: Blocks replication in downtime even

2009-04-28 Thread Stas Oskin
are removed in favor of new replicas? (Or the new ones) regards Piotr 2009/4/27 Stas Oskin stas.os...@gmail.com Thanks. 2009/4/27 Koji Noguchi knogu...@yahoo-inc.com http://hadoop.apache.org/core/docs/current/hdfs_design.html#Data+Disk+Failure%2C+Heartbeats+and+Re-Replication

Re: Getting free and used space

2009-04-28 Thread Stas Oskin
, at 8:51 AM, Stas Oskin wrote: Hi. Thanks for the explanation. Now for the easier part - how do I specify the user when connecting? :) Is it a config file level, or run-time level setting? Regards. 2009/4/8 Brian Bockelman bbock...@cse.unl.edu Hey Stas, Did you try

Blocks replication in downtime even

2009-04-27 Thread Stas Oskin
Hi. I have a question: If I have N DataNodes, and one or several of the nodes become unavailable, would HDFS re-synchronize the blocks automatically, according to the replication level set? And if yes, when? As soon as the offline node is detected, or only on file access? Regards.

Re: Blocks replication in downtime even

2009-04-27 Thread Stas Oskin
this helps. Koji -Original Message- From: Stas Oskin [mailto:stas.os...@gmail.com] Sent: Monday, April 27, 2009 4:11 AM To: core-user@hadoop.apache.org Subject: Blocks replication in downtime even Hi. I have a question: If I have N of DataNodes, and one or several of the nodes have

Re: No route to host prevents from storing files to HDFS

2009-04-23 Thread Stas Oskin
Hi. Shouldn't you be testing connecting _from_ the datanode? The error you posted is while this DN is trying to connect to another DN. You might be onto something here indeed: 1) Telnet to 192.168.253.20 8020 / 192.168.253.20 50010 works 2) Telnet to localhost 8020 / localhost 50010 doesn't

Re: No route to host prevents from storing files to HDFS

2009-04-23 Thread Stas Oskin
Hi. I have one question: is the IP address consistent? I think in one of the thread mails it was stated that the IP address sometimes changes. Same static IPs for all servers. By the way, I have fs.default.name defined as an IP address; could it be somehow related? I read that there were

Re: No route to host prevents from storing files to HDFS

2009-04-23 Thread Stas Oskin
Hi. Maybe, but there will still be at least one virtual network adapter on the host. Try turning them off. Nope, still throws No route to host exceptions. I have another IP address defined on this machine - 192.168.253.21, for the same network adapter. Any idea if it has an impact? The

Re: No route to host prevents from storing files to HDFS

2009-04-23 Thread Stas Oskin
Hi. 2009/4/23 Matt Massie m...@cloudera.com Just for clarity: are you using any type of virtualization (e.g. vmware, xen) or just running the DataNode java process on the same machine? What is fs.default.name set to in your hadoop-site.xml? This machine has OpenVZ installed indeed, but

Re: No route to host prevents from storing files to HDFS

2009-04-23 Thread Stas Oskin
, they were disabled on start-up, so they shouldn't come up in the first place. Regards. 2009/4/23 Stas Oskin stas.os...@gmail.com Hi. Also iptables -L for each machine as an afterthought - just for paranoia's sake Well, I started preparing all the information you requested, but when I got

Re: No route to host prevents from storing files to HDFS

2009-04-22 Thread Stas Oskin
Hi. 2009/4/22 jason hadoop jason.had...@gmail.com Most likely that machine is affected by some firewall somewhere that prevents traffic on port 50075. The no route to host is a strong indicator, particularly if the Datanode registered with the namenode. Yes, this was my first thought as

Re: No route to host prevents from storing files to HDFS

2009-04-22 Thread Stas Oskin
Hi. There is some mismatch here.. What is the expected IP address of this machine (or does it have multiple interfaces that are properly routed)? Looking at the Receiving Block message, the DN thinks its address is 192.168.253.20 but the NN thinks it is 253.32 (and the client is able to connect using 253.32).

Re: No route to host prevents from storing files to HDFS

2009-04-22 Thread Stas Oskin
Hi. The way to diagnose this explicitly is: 1) On the server machine that should be accepting connections on the port, run telnet localhost PORT and telnet IP PORT; you should get a connection. If not, then the server is not binding the port. 2) On the remote machine, verify that you can

Re: No route to host prevents from storing files to HDFS

2009-04-22 Thread Stas Oskin
Hi. Is it possible to paste the output from the following command on both your DataNode and NameNode? % route -v -n Sure, here it is: Kernel IP routing table Destination Gateway Genmask Flags Metric Ref Use Iface 192.168.253.0 0.0.0.0 255.255.255.0 U 0

No route to host prevents from storing files to HDFS

2009-04-21 Thread Stas Oskin
Hi. I have quite a strange issue, where one of the datanodes that I have, rejects any blocks with error messages. I looked in the datanode logs, and found the following error: 2009-04-21 16:59:19,092 ERROR org.apache.hadoop.dfs.DataNode: DatanodeRegistration(192.168.253.20:50010,

Re: No route to host prevents from storing files to HDFS

2009-04-21 Thread Stas Oskin
to host error? 2009/4/21 Stas Oskin stas.os...@gmail.com Hi. I have quite a strange issue, where one of the datanodes that I have, rejects any blocks with error messages. I looked in the datanode logs, and found the following error: 2009-04-21 16:59:19,092 ERROR org.apache.hadoop.dfs.DataNode

HDFS and web server

2009-04-14 Thread Stas Oskin
Hi. Has anyone succeeded in running a web server from HDFS? I mean, serving websites and applications directly from HDFS, perhaps via FUSE/WebDAV? Regards.

Re: HDFS and web server

2009-04-14 Thread Stas Oskin
Hi. 2009/4/14 Michael Bieniosek micb...@microsoft.com webdav server - https://issues.apache.org/jira/browse/HADOOP-496 There's a fuse issue somewhere too, but I never managed to get it working. As far as serving websites directly from HDFS goes, I would say you'd probably have better luck

Running balancer throws exceptions

2009-04-12 Thread Stas Oskin
Hi. When I run hadoop balancer, I get the following error: 09/04/12 10:28:46 INFO dfs.Balancer: Will move 3.02 GBbytes in this iteration Apr 12, 2009 10:28:46 AM 0 0 KB 19.02 GB 3.02 GB 09/04/12 10:28:46 INFO dfs.Balancer: Decided to move block

Re-balancing blocks

2009-04-11 Thread Stas Oskin
Hi. Should I call the block re-balancer every time a new DataNode is added or removed? Or what is the recommended procedure for re-balancing the blocks for better fault tolerance? Regards.

NameNode resiliency

2009-04-11 Thread Stas Oskin
Hi. I wonder what the Hadoop community uses in order to make the NameNode resilient to failures? I mean, what high-availability measures are taken to keep HDFS available even in the case of a NameNode failure? So far I've read about a possible solution using DRBD, and another one using CARP. Both of them had the

Re: NameNode resiliency

2009-04-11 Thread Stas Oskin
Hi. Any tutorial about using ZooKeeper with the NameNode? Thanks! 2009/4/12 Billy Pearson sa...@pearsonwholesale.com Not 100% sure, but I think they plan on using ZooKeeper to help with NameNode failover, but that may have changed. Billy Stas Oskin stas.os...@gmail.com wrote in message news

Re: HDFS read/write speeds, and read optimization

2009-04-10 Thread Stas Oskin
Hi. Hypertable (a BigTable implementation) has a good KFS vs. HDFS breakdown: http://code.google.com/p/hypertable/wiki/KFSvsHDFS From this comparison it seems KFS is quite a bit faster than HDFS for small data transfers (for SQL commands). Any idea if the same holds true for small-to-medium (20Mb - 150

Re: HDFS read/write speeds, and read optimization

2009-04-10 Thread Stas Oskin
Hi. Depends. What hardware? How much hardware? Is the cluster under load? What does your I/O load look like? As a rule of thumb, you'll probably expect very close to hardware speed. Standard Xeon dual cpu, quad core servers, 4 GB RAM. The DataNodes also do some processing, with usual

Does the HDFS client read the data from NameNode, or from DataNode directly?

2009-04-10 Thread Stas Oskin
Hi. I wanted to verify a point about HDFS client operations: When asking for a file, is all the communication done through the NameNode? Or, after being pointed to the correct DataNode, does HDFS work directly against it? Also, the NameNode provides a URL named streamFile which allows any HTTP client

Thin version of Hadoop jar for client

2009-04-10 Thread Stas Oskin
Hi. Is there any thin version of the Hadoop jar, specifically for a Java client? Regards.

Re: HDFS read/write speeds, and read optimization

2009-04-10 Thread Stas Oskin
Thanks for sharing. For comparison, on a 1400 node cluster, I can checksum 100 TB in around 10 minutes, which means I'm seeing read averages of roughly 166 GB/sec. For writes with replication of 3, I see roughly 40-50 minutes to write 100TB, so roughly 33 GB/sec average. Of course the peaks

Two degrees of replications reliability

2009-04-10 Thread Stas Oskin
Hi. I know that there were some hard-to-find bugs with replication set to 2, which caused data loss for HDFS users. Was there any progress on these issues, and are there any fixes which were introduced? Regards.

Re: Does the HDFS client read the data from NameNode, or from DataNode directly?

2009-04-10 Thread Stas Oskin
Thanks, this is what I thought. Regards. 2009/4/10 Alex Loddengaard a...@cloudera.com Data is streamed directly from the data nodes themselves. The name node is only queried for block locations and other meta data. Alex On Fri, Apr 10, 2009 at 8:33 AM, Stas Oskin stas.os...@gmail.com

Re: HDFS read/write speeds, and read optimization

2009-04-10 Thread Stas Oskin
Hi. Depends on what kind of I/O you do - are you going to be using MapReduce and co-locating jobs and data? If so, it's possible to get close to those speeds if you are I/O bound in your job and read right through each chunk. If you have multiple disks mounted individually, you'll need the

Re: Two degrees of replications reliability

2009-04-10 Thread Stas Oskin
2009/4/10 Brian Bockelman bbock...@cse.unl.edu Most of the issues were resolved in 0.19.1 -- I think 0.20.0 is going to be even better. We run about 300TB @ 2 replicas, and haven't had file loss that was Hadoop's fault since about January. Brian And you're running 0.19.1? Regards.

Re: Two degrees of replications reliability

2009-04-10 Thread Stas Oskin
Actually, now I remember that you posted some time ago about your university losing about 300 files. So the situation has improved since then, I presume? 2009/4/10 Stas Oskin stas.os...@gmail.com 2009/4/10 Brian Bockelman bbock...@cse.unl.edu Most of the issues were resolved in 0.19.1 -- I

Re: Does the HDFS client read the data from NameNode, or from DataNode directly?

2009-04-10 Thread Stas Oskin
Hi. What happens here is that the NameNode redirects you to a smartly chosen DataNode (a data node that has some of the file's first 5 blocks, I think), and that DataNode proxies the file for you. Specifically, the assembly of the full file from multiple nodes happens on that

Listing the files in HDFS directory

2009-04-10 Thread Stas Oskin
Hi. I must be completely missing it in the API, but what is the function to list directory contents? Thanks!

Abandoning block messages

2009-04-10 Thread Stas Oskin
Hi. I'm testing HDFS fault-tolerance, and randomly disconnecting and powering off the chunk nodes to see how HDFS withstands it. After I disconnected a chunk node, the logs started to fill with the following errors: 09/04/11 02:25:12 INFO dfs.DFSClient: Exception in

Re: Listing the files in HDFS directory

2009-04-10 Thread Stas Oskin
Hi. Just wanted to tell that FileStatus did the trick. Regards. 2009/4/11 Stas Oskin stas.os...@gmail.com Hi. I must be completely missing it in API, but what is the function to list the directory contents? Thanks!
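For reference, a FileStatus-based listing reads roughly like this; the directory path is a placeholder assumption:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ListDirectory {
        public static void main(String[] args) throws IOException {
            FileSystem fs = FileSystem.get(new Configuration());
            // List the entries of a directory and print path and size
            FileStatus[] entries = fs.listStatus(new Path("/test")); // hypothetical directory
            for (FileStatus entry : entries) {
                System.out.println(entry.getPath() + "\t" + entry.getLen() + " bytes");
            }
        }
    }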

HDFS read/write speeds, and read optimization

2009-04-09 Thread Stas Oskin
Hi. I have 2 questions about HDFS performance: 1) How fast are the read and write operations over the network, in Mbps? 2) If the chunk server is located on the same host as the client, is there any optimization of read operations? For example, Kosmos FS describes the following functionality:

Getting free and used space

2009-04-08 Thread Stas Oskin
Hi. I'm trying to use the API to get the overall used and free space. I tried the function getUsed(), but it always returns 0. Any ideas? Thanks.

Re: Getting free and used space

2009-04-08 Thread Stas Oskin
permission errors... in most of the released versions, getUsed() is only available to the Hadoop superuser. It may be that the exception isn't propagating correctly. Brian On Apr 8, 2009, at 3:13 AM, Stas Oskin wrote: Hi. I'm trying to use the API to get the overall used and free spaces. I
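A minimal sketch of the call in question; per the reply above, on the releases discussed it may only return a meaningful value for the Hadoop superuser:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;

    public class UsedSpace {
        public static void main(String[] args) throws IOException {
            FileSystem fs = FileSystem.get(new Configuration());
            // Total bytes used across the filesystem; may report 0 (or fail)
            // when the caller is not the superuser on older releases.
            long used = fs.getUsed();
            System.out.println("Used: " + used + " bytes");
        }
    }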

Re: Hadoop for real time

2008-10-20 Thread Stas Oskin
cases and video files aren't really splittable for map-reduce purposes. That might mean that you could get away with a mogile-ish system. On Tue, Oct 14, 2008 at 1:29 PM, Stas Oskin [EMAIL PROTECTED] wrote: Hi. Video storage, processing and streaming. Regards. 2008/9/25 Edward J

Re: Hadoop for real time

2008-10-14 Thread Stas Oskin
Hi. Video storage, processing and streaming. Regards. 2008/9/25 Edward J. Yoon [EMAIL PROTECTED] What kind of the real-time app? On Wed, Sep 24, 2008 at 4:50 AM, Stas Oskin [EMAIL PROTECTED] wrote: Hi. Is it possible to use Hadoop for real-time app, in video processing field

Hadoop for real time

2008-09-23 Thread Stas Oskin
Hi. Is it possible to use Hadoop for real-time app, in video processing field? Regards.