[jira] [Resolved] (HADOOP-1222) Record IO C++ binding: buffer type not handled correctly
[ https://issues.apache.org/jira/browse/HADOOP-1222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-1222. - Resolution: Won't Fix Resolving as Won't Fix, since the whole recordio component is now deprecated in favor of Avro (and technically ought to be removed in 0.22/0.23). Please see https://issues.apache.org/jira/browse/HADOOP-6155 Record IO C++ binding: buffer type not handled correctly Key: HADOOP-1222 URL: https://issues.apache.org/jira/browse/HADOOP-1222 Project: Hadoop Common Issue Type: Bug Components: record Reporter: David Bowen Attachments: test.cc I added this code to the test, which currently only tests serialization/deserialization of an empty buffer. std::string b = r1.getBufferVal(); static char buffer[] = {0, 1, 2, 3, 4, 5}; for (int i = 0; i < 6; i++) { b.push_back(buffer[i]); } The csv test fails. The generated file looks like this. T,102,4567,99344109427290,3.145000,1.523400,',# 0 1 2 3 4 5 0 1 2 3 4 5,v{},m{} The xml test passes, but the data in the xml file is wrong: <value><string>000102030405000102030405000102030405</string></value> -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HADOOP-1223) Record IO C++ binding: non-empty vector of strings does not work
[ https://issues.apache.org/jira/browse/HADOOP-1223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-1223. - Resolution: Won't Fix Resolving as Won't Fix, since the whole recordio component is now deprecated in favor of Avro (and technically ought to be removed in 0.22/0.23). Please see https://issues.apache.org/jira/browse/HADOOP-6155 Record IO C++ binding: non-empty vector of strings does not work Key: HADOOP-1223 URL: https://issues.apache.org/jira/browse/HADOOP-1223 Project: Hadoop Common Issue Type: Bug Components: record Reporter: David Bowen Attachments: test.cc It works in the binary case, but not in CSV or XML. Here is the code to put some strings in the vector. std::vector<std::string> v = r1.getVectorVal(); v.push_back("hello"); v.push_back("world"); In the CSV file, the strings appear twice, for some reason. In the XML file they appear three times. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HADOOP-1277) The class generated by Hadoop Record rcc should provide a static method to return the DDL string
[ https://issues.apache.org/jira/browse/HADOOP-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-1277. - Resolution: Won't Fix Resolving as Won't Fix, since the whole recordio component is now deprecated in favor of Avro (and technically ought to be removed in 0.22/0.23). Please see https://issues.apache.org/jira/browse/HADOOP-6155 The class generated by Hadoop Record rcc should provide a static method to return the DDL string Key: HADOOP-1277 URL: https://issues.apache.org/jira/browse/HADOOP-1277 Project: Hadoop Common Issue Type: New Feature Components: record Reporter: Runping Qi The method will look like: public static String getDDL(); With this class, when a map/reduce job writes out sequence files with such a generated class as its value class, the job can also save the DDL of the class into a file. With such a file around, we can implement a record reader that can generate the required class on demand, and thus can read a sequence file of Hadoop Records without having the class a priori. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
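As an illustration, a generated class carrying such a method might look roughly like the following sketch. The class name, fields, and DDL text below are hypothetical (they are not actual rcc output); the sketch only shows the shape of the proposed static accessor.

{code}
// Illustrative sketch only: a hypothetical rcc-generated class with the
// proposed static getDDL() accessor added.
public class Employee /* rcc output would extend org.apache.hadoop.record.Record */ {
  private String name;
  private int id;

  /** Proposed addition: return the DDL this class was generated from. */
  public static String getDDL() {
    return "class Employee { ustring name; int id; }";
  }

  // ... the generated serialize()/deserialize() methods would follow ...
}
{code}

A job could then write the string returned by getDDL() alongside its sequence files, and a reader could regenerate and compile the class on demand, as the description above suggests.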
[jira] [Resolved] (HADOOP-1225) Record IO class should provide a toString(String charset) method
[ https://issues.apache.org/jira/browse/HADOOP-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-1225. - Resolution: Won't Fix Resolving as Won't Fix, since the whole recordio component is now deprecated in favor of Avro (and technically ought to be removed in 0.22/0.23). Please see https://issues.apache.org/jira/browse/HADOOP-6155 Record IO class should provide a toString(String charset) method Key: HADOOP-1225 URL: https://issues.apache.org/jira/browse/HADOOP-1225 Project: Hadoop Common Issue Type: Improvement Components: record Reporter: Runping Qi Assignee: Sameer Paranjpye Currently, the toString() function returns the csv format serialized form of the record object. Unfortunately, all the fields of Buffer type are serialized into hex strings. Although this is a lossless conversion, it is not the most convenient form when people use Buffer to store international text. With a new function toString(String charset), the user can pass a charset to indicate the desired way to convert a Buffer to a String. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
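A minimal sketch of what the proposed method could look like on a generated record class. The class and field below are hypothetical, and a plain byte[] stands in for the contents of a Buffer-typed field; the real implementation would live in the generated Java code.

{code}
import java.io.UnsupportedEncodingException;

// Hypothetical record class; only the proposed method is sketched here.
public class Message {
  private byte[] content; // stands in for a Buffer-typed field's bytes

  /**
   * Proposed addition: render Buffer fields as text in the caller's charset
   * instead of the hex string produced by the plain toString().
   */
  public String toString(String charset) throws UnsupportedEncodingException {
    return "Message(content=" + new String(content, charset) + ")";
  }
}
{code}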
[jira] [Resolved] (HADOOP-1095) Provide ByteStreams in C++ version of record I/O
[ https://issues.apache.org/jira/browse/HADOOP-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-1095. - Resolution: Won't Fix Resolving as Won't Fix, since the whole recordio component is now deprecated in favor of Avro (and technically ought to be removed in 0.22/0.23). Please see https://issues.apache.org/jira/browse/HADOOP-6155 Provide ByteStreams in C++ version of record I/O Key: HADOOP-1095 URL: https://issues.apache.org/jira/browse/HADOOP-1095 Project: Hadoop Common Issue Type: Improvement Components: record Affects Versions: 0.12.0 Environment: All Reporter: Milind Bhandarkar Assignee: Vivek Ratan Implement ByteInStream and ByteOutStream for the C++ runtime, as they will be needed for using Hadoop Record I/O with the forthcoming C++ MapReduce framework (currently, only FileStreams are provided.) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HADOOP-1227) Record IO C++ binding: cannot write more than one record to an XML stream and read them back
[ https://issues.apache.org/jira/browse/HADOOP-1227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-1227. - Resolution: Won't Fix Resolving as Won't Fix, since the whole recordio component is now deprecated in favor of Avro (and technically ought to be removed in 0.22/0.23). Please see https://issues.apache.org/jira/browse/HADOOP-6155 Record IO C++ binding: cannot write more than one record to an XML stream and read them back Key: HADOOP-1227 URL: https://issues.apache.org/jira/browse/HADOOP-1227 Project: Hadoop Common Issue Type: Bug Components: record Reporter: David Bowen I tried just writing the same record twice and then reading it back twice, and got a segmentation fault. This works fine in the binary and csv cases. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HADOOP-6712) librecordio support for xerces 3, eliminate compiler warnings and the (optional) ability to compile in the source directory
[ https://issues.apache.org/jira/browse/HADOOP-6712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-6712. - Resolution: Won't Fix Resolving as Won't Fix, since the whole recordio component is now deprecated in favor of Avro (and technically ought to be removed in 0.22/0.23). Please see https://issues.apache.org/jira/browse/HADOOP-6155 Do reopen if you wish to maintain recordio over 0.20.x branches. Although Avro works with 0.20 as well. librecordio support for xerces 3, eliminate compiler warnings and the (optional) ability to compile in the source directory --- Key: HADOOP-6712 URL: https://issues.apache.org/jira/browse/HADOOP-6712 Project: Hadoop Common Issue Type: Bug Components: record Environment: 64-bit linux w/gcc 4.4.3 w/xerces 3 Reporter: John Plevyak Attachments: librecordio-jp-v1.patch I don't know if this code is currently supported, but since it is in the tree here are some fixes: 1. Support for xerces 3.X as well as 2.X: the patch checks XERCES_VERSION_MAJOR and I have tested on 3.X, but before committing someone should retest on 2.X. 2. gcc 4.4.3 on 64-bit complains about using %lld with int64_t; casting to 'long long int' solves the issue. 3. Since there is currently no ant target, check if LIBRECORDIO_BUILD_DIR is undefined and, if so, assume '.' to support compiling in the source directory. This should not affect normal compilation if/when an ant target is created. Patch attached. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HADOOP-1916) FSShell put or CopyFromLocal incorrectly treats .
[ https://issues.apache.org/jira/browse/HADOOP-1916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-1916. - Resolution: Fixed Fixed by a redesign on trunk. Please see https://issues.apache.org/jira/browse/HADOOP-7176 for the umbrella of changes leading to this. FSShell put or CopyFromLocal incorrectly treats . --- Key: HADOOP-1916 URL: https://issues.apache.org/jira/browse/HADOOP-1916 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 0.14.1 Reporter: Konstantin Shvachko Assignee: Chris Douglas Attachments: 1916.patch The following dfs shell command {code} bin/hadoop dfs -put README.txt . {code} results in creating a file /user/<user name> with the contents of README.txt. A correct behavior would be creating a directory and a file in it: /user/<user name>/README.txt The put command works correctly if /user/<user name> already exists. So the following sequence of commands leads to the desired result: {code} bin/hadoop dfs -mkdir . bin/hadoop dfs -put README.txt . {code} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HADOOP-1919) Add option to allow Binding Jetty to localhost
[ https://issues.apache.org/jira/browse/HADOOP-1919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-1919. - Resolution: Not A Problem Not a problem. This can be done as described in the previous comment, with the logical repercussions that entails. Add option to allow Binding Jetty to localhost -- Key: HADOOP-1919 URL: https://issues.apache.org/jira/browse/HADOOP-1919 Project: Hadoop Common Issue Type: New Feature Affects Versions: 0.14.0 Reporter: Thurman Turner Priority: Minor We would like a configurable option to have Jetty bound to the loopback address of the machine so that the dfs-browser is not accessible from outside the host. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HADOOP-1994) Variable names generated by Record I/O should not clash with user fields
[ https://issues.apache.org/jira/browse/HADOOP-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-1994. - Resolution: Won't Fix Record IO has been deprecated with the advent of Avro. Please see HADOOP-6155 Resolving as Won't Fix. Variable names generated by Record I/O should not clash with user fields Key: HADOOP-1994 URL: https://issues.apache.org/jira/browse/HADOOP-1994 Project: Hadoop Common Issue Type: Bug Reporter: Vivek Ratan Assignee: Vivek Ratan The code (Java and C++) spit out by the Record I/O compiler contains variables. We need to make sure these variable names don't clash with names used by users in the DDL, otherwise the generated code will not compile. Variable names such as 'a', 'peer', etc., are used. We need better names. For example, if I have a DDL of the form {code} class s1 { int a; boolean peer; int a_; } {code} Neither the Java nor the C++ code will compile. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: java.lang.Throwable: Child Error And Task process exit with nonzero status of 1.
The job may have succeeded due to the task having run successfully on another tasktracker after a retry attempt was scheduled. This probably means one of your TTs has something bad on it, and it should be easily identifiable from the UI. If all TTs are bad, your job would fail -- so yes, better to fix than worry about expecting failures. On Mon, Jul 11, 2011 at 11:53 PM, C.V.Krishnakumar Iyer f2004...@gmail.com wrote: Hi, I get this error too. But the Job completes properly. Is this error any cause for concern? As in, would any computation be hampered because of this? Thanks ! Regards, Krishnakumar On Jul 11, 2011, at 10:53 AM, Bharath Mundlapudi wrote: That number is around 40K (I think). I am not sure if you have certain configurations to clean up user task logs periodically. We have solved this problem in MAPREDUCE-2415, which is part of 0.20.204. But if you clean up the task logs periodically, you will not run into this problem. -Bharath -- Harsh J
Re: XXXWritable
Do have a look at Apache Avro's use with MapReduce. It helps solve some issues related to serialization in the way you are talking about: http://avro.apache.org On Sat, Jul 2, 2011 at 7:59 AM, Raja Nagendra Kumar nagendra.r...@tejasoft.com wrote: Hi, I read in the Definitive Guide that LongWritable and the other xxWritables are more optimized versions, for network serialization, of normal Java Long etc. If that is so, would it not be easier for developers to use normal Java long and String, so the hadoop framework can internally convert the developer-written code to use LongWritable etc., perhaps through some intermediate code conversion? This approach could greatly reduce the number of APIs and help with faster learning. Not sure why the developer has to write and use XXXWritable etc. just for the sake of optimized network serialization. Regards, Raja Nagendra Kumar, C.T.O www.tejasoft.com -Hadoop Adoption Consulting -- View this message in context: http://old.nabble.com/XXXWritable-tp31977841p31977841.html Sent from the Hadoop core-dev mailing list archive at Nabble.com. -- Harsh J
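For context on what the Writable contract actually involves, here is a small illustrative example (not part of Hadoop itself; the class name is made up) of a hand-written Writable. The explicit write/readFields pair is the compact, framework-controlled serialization that types like LongWritable provide out of the box, which is why the framework does not simply serialize java.lang.Long.

{code}
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.Writable;

// Illustrative example: a pair of longs with an explicit, compact wire format.
public class LongPairWritable implements Writable {
  private long first;
  private long second;

  public void set(long first, long second) {
    this.first = first;
    this.second = second;
  }

  @Override
  public void write(DataOutput out) throws IOException {
    // The type fully controls its own byte layout: 16 bytes, no reflection.
    out.writeLong(first);
    out.writeLong(second);
  }

  @Override
  public void readFields(DataInput in) throws IOException {
    first = in.readLong();
    second = in.readLong();
  }
}
{code}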
[jira] [Resolved] (HADOOP-7396) The information returned by the wrong usage of the command hadoop job -events job-id from-event-# #-of-events is not appropriate
[ https://issues.apache.org/jira/browse/HADOOP-7396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-7396. - Resolution: Duplicate The information returned by the wrong usage of the command hadoop job -events job-id from-event-# #-of-events is not appropriate Key: HADOOP-7396 URL: https://issues.apache.org/jira/browse/HADOOP-7396 Project: Hadoop Common Issue Type: Improvement Affects Versions: 0.23.0 Reporter: Yan Jinshuang Priority: Minor Fix For: 0.23.0 With wrong values of from-event-# and #-of-events, for example a from-event-# that comes after the #-of-events (such as from 1000 to 1), the command always returns 0. It is expected to show more detailed information, such as that the start number should be less than the end number for the range of events. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Review Request: HADOOP-7328. Improve the SerializationFactory functions.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/884/ --- (Updated 2011-06-16 12:13:34.081758) Review request for hadoop-common and Todd Lipcon. Changes --- Throw exceptions (getting rid of nulls). Add appropriate javadocs and fix one checkstyle nit. Summary --- Since getSerialization() can possibly return a null, it is only right that getSerializer() and getDeserializer() usage functions do the same, instead of throwing up NPEs. Related issue to which this improvement is required: https://issues.apache.org/jira/browse/MAPREDUCE-2584 This addresses bug HADOOP-7328. http://issues.apache.org/jira/browse/HADOOP-7328 Diffs (updated) - src/java/org/apache/hadoop/io/serializer/SerializationFactory.java dee314a Diff: https://reviews.apache.org/r/884/diff Testing --- Existing SequenceFile serialization factory tests pass. The change is merely to make the functions return null instead of throwing an NPE within. Thanks, Harsh
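A simplified sketch of the direction described above: fail with a descriptive exception when no configured serialization accepts the class, rather than letting callers hit a bare NullPointerException. This is an illustration of the idea, not the actual patch; the wrapper class below is hypothetical.

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.serializer.SerializationFactory;
import org.apache.hadoop.io.serializer.Serializer;

// Hypothetical helper illustrating the approach discussed in HADOOP-7328:
// surface a descriptive error when no serialization accepts the class.
public class SafeSerializerLookup {
  public static <T> Serializer<T> getSerializer(Configuration conf, Class<T> c) {
    SerializationFactory factory = new SerializationFactory(conf);
    if (factory.getSerialization(c) == null) {
      throw new IllegalArgumentException(
          "No serialization found for " + c.getName()
          + "; check the io.serializations configuration property.");
    }
    return factory.getSerializer(c);
  }
}
{code}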
[jira] [Resolved] (HADOOP-3436) Useless synchronized in JobTracker
[ https://issues.apache.org/jira/browse/HADOOP-3436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-3436. - Resolution: Not A Problem Does not appear to be a problem w.r.t. trunk. There is no such variable held (a collection is used instead, and that requires holding the JT lock and is synchronized (per comments)). Resolving as Not a problem (anymore). Stale issue. Useless synchronized in JobTracker -- Key: HADOOP-3436 URL: https://issues.apache.org/jira/browse/HADOOP-3436 Project: Hadoop Common Issue Type: Improvement Reporter: Brice Arnould Assignee: Brice Arnould Priority: Trivial In the original code, numTaskTrackers is fetched in a synchronized way, which is useless because it might change anyway while the algorithm is running. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Problems about the job counters
Hello, When you have a Reduce phase, the mapper needs to (sort and) materialize KVs to local files to let the reducers fetch them. This is where the FILE_BYTES_* counters come from. Similarly, the Reducer fetches and stores map output on local disk and merge-sorts it again, thus they appear for the reduce phase as well. In a map-only job, you should not generally see any FILE_BYTES_* counters. On Wed, Jun 15, 2011 at 9:32 AM, hailong.yang1115 hailong.yang1...@gmail.com wrote: Dear all, I am trying to run the built-in example wordcount with nearly 15GB of input. When the Hadoop job finished, I got the following counters.

Counter                       Map              Reduce           Total
Job Counters
  Launched reduce tasks       0                0                1
  Rack-local map tasks        0                0                35
  Launched map tasks          0                0                2,318
  Data-local map tasks        0                0                2,283
FileSystemCounters
  FILE_BYTES_READ             22,863,580,656   17,654,943,341   40,518,523,997
  HDFS_BYTES_READ             154,400,997,459  0                154,400,997,459
  FILE_BYTES_WRITTEN          33,490,829,403   17,654,943,341   51,145,772,744
  HDFS_BYTES_WRITTEN          0                2,747,356,704    2,747,356,704

My question is what does the FILE_BYTES_READ counter mean? And what is the difference between FILE_BYTES_READ and HDFS_BYTES_READ? In my opinion, all the input is located in HDFS, so where does FILE_BYTES_READ come from during the map phase? Any help will be appreciated! Hailong 2011-06-15 *** * Hailong Yang, PhD. Candidate * Sino-German Joint Software Institute, * School of Computer Science & Engineering, Beihang University * Phone: (86-010)82315908 * Email: hailong.yang1...@gmail.com * Address: G413, New Main Building in Beihang University, * No.37 XueYuan Road, HaiDian District, * Beijing, P.R.China, 100191 *** -- Harsh J
[jira] [Reopened] (HADOOP-6219) Add dumpConfiguration option in hadoop help message
[ https://issues.apache.org/jira/browse/HADOOP-6219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J reopened HADOOP-6219: - Sorry, pretty strange that both link to same issue? Ideally it should be under mapred project now. Add dumpConfiguration option in hadoop help message --- Key: HADOOP-6219 URL: https://issues.apache.org/jira/browse/HADOOP-6219 Project: Hadoop Common Issue Type: Bug Affects Versions: 0.20.1 Reporter: Ramya R Assignee: V.V.Chaitanya Krishna Priority: Trivial Fix For: 0.23.0 Attachments: HADOOP-6184-ydist.patch, HADOOP-6219-ydist.patch, MAPREDUCE-919.patch, MAPREDUCE-919.patch Execution of bin/hadoop should show the -dumpConfiguration option introduced in MAPREDUCE-768 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HADOOP-5624) @Override cleanup for Eclipse
[ https://issues.apache.org/jira/browse/HADOOP-5624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-5624. - Resolution: Not A Problem Eclipse does not seem to complain about any of the patch's changes. I do not see any @Override issues on the trunk of all three projects right now. Please do not hesitate to re-open in case it is still an issue. @Override cleanup for Eclipse - Key: HADOOP-5624 URL: https://issues.apache.org/jira/browse/HADOOP-5624 Project: Hadoop Common Issue Type: Bug Affects Versions: 0.21.0 Reporter: Carlos Valiente Priority: Trivial Attachments: HADOOP-5624.patch Eclipse complains about several methods which are marked as {{@Override}}, but which are not defined in any superclass. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HADOOP-6936) broken links in http://wiki.apache.org/hadoop/FAQ#A12//s
[ https://issues.apache.org/jira/browse/HADOOP-6936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-6936. - Resolution: Fixed Just for a history-lesson reference here: there used to be hadoop-*.xml files once upon a time. It's now split over into core-*, hdfs-*, mapred-* files (* - {site, default}). Closing, as the HowToConfigure link has also been updated by me. Although it needs more love in general (we should switch to Confluence; it's more encouraging). broken links in http://wiki.apache.org/hadoop/FAQ#A12//s Key: HADOOP-6936 URL: https://issues.apache.org/jira/browse/HADOOP-6936 Project: Hadoop Common Issue Type: Bug Components: documentation Reporter: Eugene Koontz Priority: Trivial http://wiki.apache.org/hadoop/FAQ#A12//s has links to: http://hadoop.apache.org/core/docs/current/hadoop-default.html#dfs.replication.min http://hadoop.apache.org/common/docs/current/hadoop-default.html#dfs.safemode.threshold.pct both of which are 404 as of the time of filing this issue. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Review Request: HADOOP-7328. Improve the SerializationFactory functions.
On 2011-06-12 02:24:55, Todd Lipcon wrote: Looks good to me. Can you upload this rev of the patch to the JIRA so the QA Bot runs on it? Submitted on JIRA. Thanks for the review Todd! - Harsh --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/884/#review805 --- On 2011-06-11 22:10:17, Harsh J wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/884/ --- (Updated 2011-06-11 22:10:17) Review request for hadoop-common and Todd Lipcon. Summary --- Since getSerialization() can possibly return a null, it is only right that getSerializer() and getDeserializer() usage functions do the same, instead of throwing up NPEs. Related issue to which this improvement is required: https://issues.apache.org/jira/browse/MAPREDUCE-2584 This addresses bug HADOOP-7328. http://issues.apache.org/jira/browse/HADOOP-7328 Diffs - src/java/org/apache/hadoop/io/serializer/SerializationFactory.java dee314a Diff: https://reviews.apache.org/r/884/diff Testing --- Existing SequenceFile serialization factory tests pass. The change is merely to make the functions return null instead of throwing an NPE within. Thanks, Harsh
Re: 404 on Learn about link
Bruno, While someone would eventually get to fix this live link error, the right page for the current release is at: http://hadoop.apache.org/common/docs/current/ instead of stable (just in case one does not know). On Sun, Jun 5, 2011 at 8:39 AM, Bruno P. Kinoshita brunodepau...@yahoo.com.br wrote: Hi there, I am receiving 404 when I click on Learn about link in Hadoop Common page [2]. Could somebody with karma check to see if it is a problem or if it is just down for maintenance or something similar, please? TYIA, Bruno [1] http://hadoop.apache.org/common/docs/stable/ [2] http://hadoop.apache.org/common/ -- Harsh J
Re: Question regarding network data transfer
Aishwarya, On Sun, May 29, 2011 at 6:49 AM, Aishwarya Venkataraman avenk...@cs.ucsd.edu wrote: So how does the reducer obtain the mapper's output? Does it make a network call and read data from the mapper's local storage, or does the mapper send the data? The mappers store the files at a location that is accessible by the TaskTracker's HTTP servlet. The reducers fetch all successful map attempt outputs from the TaskTrackers when they initialize. -- Harsh J
[jira] [Created] (HADOOP-7328) Give more information about a missing Serializer class
Give more information about a missing Serializer class -- Key: HADOOP-7328 URL: https://issues.apache.org/jira/browse/HADOOP-7328 Project: Hadoop Common Issue Type: Improvement Components: io Affects Versions: 0.20.2 Reporter: Harsh J Chouraria Assignee: Harsh J Chouraria Fix For: 0.23.0 When you have a key/value class that's not a Writable and you forget to configure io.serializations for it, an NPE is thrown by the tasks with no information on why, what's missing, or what led to it. I think a better exception can be thrown by SerializationFactory instead of an NPE when a class is not accepted by any of the loaded serializations. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Reopened] (HADOOP-7297) Error in the documentation regarding Checkpoint/Backup Node
[ https://issues.apache.org/jira/browse/HADOOP-7297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria reopened HADOOP-7297: --- Reopening since the issue of docs is valid. There are CN and BN node docs on the tagged svn rev: http://svn.apache.org/repos/asf/hadoop/common/tags/release-0.20.203.0/src/docs/src/documentation/content/xdocs/hdfs_user_guide.xml Error in the documentation regarding Checkpoint/Backup Node --- Key: HADOOP-7297 URL: https://issues.apache.org/jira/browse/HADOOP-7297 Project: Hadoop Common Issue Type: Bug Components: documentation Affects Versions: 0.20.203.0 Reporter: arnaud p Priority: Trivial On http://hadoop.apache.org/common/docs/r0.20.203.0/hdfs_user_guide.html#Checkpoint+Node: the command bin/hdfs namenode -checkpoint required to launch the backup/checkpoint node does not exist. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: MapReduce compilation error
It's a streaming test thing. Have a look at: https://issues.apache.org/jira/browse/MAPREDUCE-1686 On Thu, May 19, 2011 at 1:30 AM, Niels Basjes ni...@basjes.nl wrote: Today I ran into the same error and I was puzzled by the content of this file. What is the purpose of a test file that appears to have a deliberate error and no code whatsoever? 2011/3/19 Harsh J qwertyman...@gmail.com: This shouldn't really interfere with your development. You may try to exclude it from Eclipse's build, perhaps. On Sat, Mar 19, 2011 at 1:39 AM, bikash sharma sharmabiks...@gmail.com wrote: Hi, When I am compiling MapReduce source code after checking-in Eclipse, I am getting the following error: The declared package does not match the expected package testjar ClassWithNoPackage.java Hadoop-MR/src/test/mapred/testjar Any thoughts? Thanks, Bikash -- Harsh J http://harshj.com -- Met vriendelijke groeten, Niels Basjes -- Harsh J
Re: How HDFS decides where to put the block
Hello, On Mon, Apr 18, 2011 at 7:16 PM, Nan Zhu zhunans...@gmail.com wrote: Hi, all I'm confused by the question of how HDFS decides where to put the data blocks. I mean that the user invokes some commands like ./hadoop put ***, we assume that this file consists of 3 blocks, but how does HDFS decide where these 3 blocks are to be put? Most of the materials don't involve this issue, but just introduce data replicas when talking about blocks in HDFS. I'm guessing you're looking for the BlockPlacementPolicy implementations [1] and how they are applied in HDFS. Basically, the NameNode chooses the set of DNs for every new-block request (from a client) using this policy, and the DFSClient gets a list of all the nodes. It goes on to pick the first one among them to write the data to. The replication happens async, later. [1] - BlockPlacementPolicyDefault is the default implementation in use. Its source is available in the o.a.h.hdfs.server.namenode package. -- Harsh J
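As a rough illustration of the kind of decision the default policy makes, here is a toy simplification (this is not the actual BlockPlacementPolicyDefault code, which also weighs datanode load, free space, and full rack topology): prefer the writer's own node for the first replica and a different rack for the remaining ones.

{code}
import java.util.ArrayList;
import java.util.List;

// Toy simplification of the default placement heuristic, for illustration only.
public class ToyPlacement {
  static class Node {
    final String host, rack;
    Node(String host, String rack) { this.host = host; this.rack = rack; }
  }

  static List<Node> chooseTargets(List<Node> live, String writerHost, int replication) {
    List<Node> chosen = new ArrayList<Node>();
    // First replica: prefer the writer's own datanode if it is in the cluster.
    for (Node n : live) {
      if (n.host.equals(writerHost)) { chosen.add(n); break; }
    }
    if (chosen.isEmpty() && !live.isEmpty()) chosen.add(live.get(0));
    // Remaining replicas: prefer nodes on a different rack than the first.
    for (Node n : live) {
      if (chosen.size() >= replication) break;
      if (!chosen.contains(n) && !n.rack.equals(chosen.get(0).rack)) chosen.add(n);
    }
    // Fall back to any remaining node if the cluster has too few racks.
    for (Node n : live) {
      if (chosen.size() >= replication) break;
      if (!chosen.contains(n)) chosen.add(n);
    }
    return chosen;
  }
}
{code}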
Re: Getting error in Eclipse setup (SVN issue)
Hello Shyam, On Sat, Apr 16, 2011 at 4:05 AM, Shyam Sarkar shyam.s.sar...@gmail.com wrote: Hello, When I try to download from main trunk of Hadoop SVN I get following error : SVN connector cannot be loaded. This doesn't appear to be a Hadoop issue really. You need to verify your Eclipse's SVN plugin installation, etc. (sounds like it does not have a proper connector installed). You can alternatively get the svn copy using the local 'svn' program and do an 'ant eclipse' to get Eclipse project files. -- Harsh J
Re: MapReduce compilation error
This shouldn't really interfere with your development. You may try to exclude it from Eclipse's build, perhaps. On Sat, Mar 19, 2011 at 1:39 AM, bikash sharma sharmabiks...@gmail.com wrote: Hi, When I am compiling MapReduce source code after checking-in Eclipse, I am getting the following error: The declared package does not match the expected package testjar ClassWithNoPackage.java Hadoop-MR/src/test/mapred/testjar Any thoughts? Thanks, Bikash -- Harsh J http://harshj.com
Re: pointers to Hadoop eclipse
http://wiki.apache.org/hadoop/EclipseEnvironment On Thu, Mar 17, 2011 at 8:17 PM, bikash sharma sharmabiks...@gmail.com wrote: Hi, Can someone please point to any good reference that tells clearly how to checkout Hadoop code base in eclipse, make any changes and re-compile. Actually, I wanted to change some part in Hadoop, so wants to see the above effect, preferrably in eclipse. Thanks, Bikash -- Harsh J http://harshj.com
[jira] Created: (HADOOP-7192) fs -stat docs aren't updated to reflect the format features
fs -stat docs aren't updated to reflect the format features --- Key: HADOOP-7192 URL: https://issues.apache.org/jira/browse/HADOOP-7192 Project: Hadoop Common Issue Type: Improvement Components: documentation Affects Versions: 0.21.0 Environment: Linux / 0.21 Reporter: Harsh J Chouraria Assignee: Harsh J Chouraria Priority: Trivial Fix For: 0.23.0 The html docs of the 'fs -stat' command (found listed in the File System Shell Guide) do not seem to have the formatting abilities of -stat explained (along with the options). Like 'fs -help', the docs must also reflect the latest available features. I shall attach a doc-fix patch shortly. If anyone has other discrepancies to point out in the web version of the guide, please do so :) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: File access pattern on HDFS?
There is no such information (history of atime changes, although atime is held for every file in the NN) held by the NameNode right now. I think HDFS-782 is slightly relevant to maintaining a 'hot-zone' info, although at a block level and among datanodes. I couldn't find a jira that talks about keeping a list of atime modifications on the NameNode. On Mon, Mar 7, 2011 at 4:00 AM, Gautam Singaraju gautam.singar...@gmail.com wrote: Hi, Is there a mechanism to get the list of files accessed on HDFS at the NameNode? Thanks! --- Gautam -- Harsh J www.harshj.com
Re: measure the resource usage of each map/reduce task
Hello, On Tue, Mar 1, 2011 at 7:29 PM, bikash sharma sharmabiks...@gmail.com wrote: Hi, As a follow-up question, do map/reduce tasks run as threads or processes? Every launched Task runs as an independent process, communicating over a network interface (lo) with the TaskTracker for reporting/etc. purposes. -- Harsh J www.harshj.com
Re: source versioning question
0.22 had been branched from the trunk quite a while ago (I think that signifies a feature freeze). So the trunk is now heading for the 0.23 development. On Tue, Jan 11, 2011 at 11:22 AM, Noah Watkins jayh...@cs.ucsc.edu wrote: What is the relation between the current trunk and branch-0.22? Is trunk the current dev for 0.23 or 0.22? Thanks, Noah -- Harsh J www.harshj.com
Re: Developing Hadoop in Eclipse
You can launch them (the daemons) from Eclipse itself -- there must be a launch target provided in 0.21 if I am right, OR you can build a fresh tar using the `ant tar` target. Schedulers are also pluggable in Hadoop, so you can develop one without needing to edit Hadoop's sources. Check contrib/ for the capacity/fair schedulers, for example. -- Harsh J www.harshj.com
Re: Process ID and Hadoop job ID
Hi, On Wed, Dec 8, 2010 at 3:18 PM, radheshyam nanduri radheshyam.nand...@gmail.com wrote: Hi, I want to know if there is any way to find out the process id (PID) of a task running on a TaskTracker corresponding to a particular Hadoop job ID. All the Hadoop tasks are launched as java processes. So, is there any way to differentiate among them to get the PID of a particular task of a particular Hadoop job. Not sure if there's a way to get the launched PIDs, but there are TaskIDs available for every TaskInProgress decided for a job (and every execution attempt thereof). -- Harsh J www.harshj.com
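If the goal is to tie work back to a particular task rather than to an OS process, one option is to read the task attempt ID from within the task itself. The sketch below assumes the classic org.apache.hadoop.mapred API and that the framework exposes the attempt ID through the mapred.task.id configuration property; treat both as assumptions to verify against your version.

{code}
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

// Sketch: report the task attempt ID from within a map task.
public class AttemptIdMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, LongWritable> {

  private String attemptId;

  @Override
  public void configure(JobConf job) {
    // Assumed property; the value embeds the job ID and attempt number.
    attemptId = job.get("mapred.task.id");
  }

  @Override
  public void map(LongWritable key, Text value,
      OutputCollector<Text, LongWritable> output, Reporter reporter)
      throws IOException {
    reporter.setStatus("Running as " + attemptId);
    output.collect(value, key);
  }
}
{code}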
Re: FileInputFormat.setInputPaths Problem
Hi, 2010/12/4 Rawan AlSaad rawan.als...@hotmail.com: I need to know how to pass the input folder path to the java class through the function FileInputFormat.setInputPaths(conf, new Path(input)) Try FileInputFormat.addInputPath(...) for a single path entry at a time perhaps? I'm not sure what's going wrong here though. -- Harsh J www.harshj.com
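For reference, a minimal sketch of both calls using the classic mapred API; the paths and class name below are placeholders for illustration.

{code}
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobConf;

// Sketch of the two ways to point a job at its input.
public class InputPathSetup {
  public static void configure(JobConf conf) {
    // Replace whatever input paths were set before with exactly these:
    FileInputFormat.setInputPaths(conf, new Path("/user/example/input"));

    // Or accumulate additional paths one at a time:
    FileInputFormat.addInputPath(conf, new Path("/user/example/more-input"));

    FileOutputFormat.setOutputPath(conf, new Path("/user/example/output"));
  }
}
{code}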
Re: Configure hadoop in eclipse
Hi, On Fri, Nov 5, 2010 at 8:12 PM, Rafael Braga rafaeltelemat...@gmail.com wrote: anybody? Saw in the link: http://osdir.com/ml/dev-harmony-apache/2010-10/msg00017.html that it's necessary to include the jar sun-javadoc.jar, is that correct? I don't seem to have that JAR on my build path here in Eclipse. All default Sun JRE jars + Hadoop lib jars + the ANT_HOME env property seem enough. And when I tell of connection, I talk about the attachment, sorry. Don't think the ML allows attachments. On Thu, Nov 4, 2010 at 1:06 PM, Rafael Braga rafaeltelemat...@gmail.com wrote: Sorry, it was a problem on my connection. thanks, On Thu, Nov 4, 2010 at 1:00 PM, Nan Zhu zhunans...@gmail.com wrote: attachment missed? Nan On Thu, Nov 4, 2010 at 11:35 PM, Rafael Braga rafaeltelemat...@gmail.com wrote: Hi everybody, I followed the tutorial: http://wiki.apache.org/hadoop/EclipseEnvironment and saw the screencast: http://vimeo.com/4193623. The build.xml ran without problems, but after I turn on Project...Build Automatically, errors happen in the class: ExcludePrivateAnnotationsJDiffDoclet (see attachment). And in the Problems view of eclipse the same errors are shown too (see attachment). what might be wrong? thanks, -- Rafael Braga -- Rafael Braga http://www.linkedin.com/myprofile?trk=hb_tab_pro -- Rafael Braga http://www.linkedin.com/myprofile?trk=hb_tab_pro -- Harsh J www.harshj.com
Re: i want to contribute
On Mon, Oct 25, 2010 at 10:51 AM, goutham patnaik goutham.patn...@gmail.com wrote: I've been looking into contributing to the code base and figured writing test cases for the common mapreduce examples was a good place to start. I got this idea, of course, from the main project suggestions page. I was wondering about the status of the following jira ticket: https://issues.apache.org/jira/browse/MAPREDUCE-2008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel Looks like a person was already working on it. Perhaps it'd be best to contact him/her before progressing onto this issue? You can find that person's contact details here: https://issues.apache.org/jira/secure/ViewProfile.jspa?name=kanjilal says it's still open? Is anybody working on this actively right now? If not, I'd like to have a go at it. Goutham -- Harsh J www.harshj.com
Re: what happens inside hadoop !!
The source is your friend. And perhaps a good Java IDE too. I use Eclipse + F3. But since you ask, you may begin at the wiki: http://wiki.apache.org/hadoop/FrontPage There's stuff there not many see, and those pretty much cover enough to get you started at the right places :) About a document, I guess O'Malley's Hadoop MR Arch one would help, but nothing beats reading sources the way it's supposed to be done: http://docs.huihoo.com/apache/hadoop/HadoopMapReduceArch.pdf On Fri, Oct 22, 2010 at 8:01 PM, Ahmad Shahzad ashahz...@gmail.com wrote: Hi ALL, Is there any documentation or guide or any presentation about what happens inside hadoop? I mean, there is different documentation about map-reduce and hdfs and it tells what they do, but what is happening inside is not mentioned in those articles. Any idea !! Ahmad -- Harsh J www.harshj.com
Re: HELP !!!! configuring hadoop on ECLIPSE
Hi, If all you want to do is to write programs that use your stable hadoop libraries, have a look at the Hadoop Eclipse plugin that comes along (inside the contrib folders). If you want your stable hadoop as a project inside your eclipse itself, run `ant eclipse` in hadoop's extracted directory (or was it eclipse-files?) and then import the folder using the 'Existing Projects into Workspace' import option. Alternatively, for the former requirement, you may use KarmaSphere's Hadoop eclipse plugin (the free Community Edition). It's a great tool to use as well. Both the Apache-supplied plugin and the KarmaSphere plugin allow you to run your MR code instantly on a supplied cluster. And then some. P.S. You might need to patch the existing apache-supplied hadoop eclipse plugin a bit to make it usable on the latest versions of Eclipse. A shameless self-blog-reference follows: http://www.harshj.com/2010/07/18/making-the-eclipse-plugin-work-for-hadoop/ On Wed, Aug 11, 2010 at 9:33 PM, Ahmad Shahzad ashahz...@gmail.com wrote: Hi Saikat, Can you please provide more detail on how to do it. I tried creating a new java project, but I don't know how to associate the hadoop source folders. Secondly, I tried creating an eclipse project from an existing Ant buildfile and gave it the hadoop build file that is in the hadoop directory, but it asks me to select the javac declaration to use to define the project and gives me a set of options such as: javac task found in target compile-rcc-compiler javac task found in target compile-core-classes javac task found in target compile-mapred-classes javac task found in target compile-hdfs-classes javac task found in target compile-tools javac task found in target compile-examples javac task found in target compile-core-test javac task found in target compile-ant-tasks I tried it with compile-rcc-compiler and compile-ant-tasks but it gives me the following error: problem setting classpath of the project from the javac classpath: Reference ivu-common.classpath not found. I will appreciate your reply. Ahmad -- Harsh J www.harshj.com
Re: How to Build Hadoop code in eclipse
Running the `ant eclipse-files` target will give you nearly usable .project and .classpath files. Import the Hadoop project into Eclipse using these. Or you could always check out a stable branch/tag via SVN and go ahead with the original wiki instructions :) On Wed, Aug 11, 2010 at 6:02 PM, Ahmad Shahzad ashahz...@gmail.com wrote: Hi All, I wanted to ask a related question to this one. How would you set up hadoop on eclipse if you don't want to download it from svn, but rather just want to configure a stable release, e.g. 0.20.2, on eclipse? So, I want to configure a stable release on eclipse, add/change the code I want, and run it through ant. Ahmad On Sun, Aug 8, 2010 at 4:36 AM, Saikat Kanjilal sxk1...@hotmail.com wrote: I've been able to build the code successfully in Eclipse by using the svn plugin and importing the code and using ant. I actually followed the wiki instructions and did an svn checkout inside Eclipse and was able to run all of the ant targets successfully. Sent from my iPhone On Aug 7, 2010, at 6:31 PM, thinke365 thinke...@gmail.com wrote: Maybe the official way to build hadoop is using hudson; the developers just use vim to get their work done, without an IDE such as Eclipse. In my opinion, hadoop cooperates badly with IDEs. ashish pareek wrote: Hello Friends, If you know the solution to this problem please reply back. On Mon, Jul 27, 2009 at 2:58 PM, ashish pareek pareek...@gmail.com wrote: Hi Everybody, Is there any easy and elaborate page where it's explained how to build the hadoop code? I followed the http://wiki.apache.org/hadoop/EclipseEnvironment instructions and even the video, but I am getting the error: BUILD FAILED : java.net.UnknownHostException : repo2.maven.org But when accessed through the browser this site is working. I browse through a proxy and I have set up the user name and password correctly. Can anyone suggest a possible solution? Thanks in advance. Regards, Ashish -- View this message in context: http://old.nabble.com/How-to-Build-Hadoop-code-in-eclipse-tp24676996p29377931.html Sent from the Hadoop core-dev mailing list archive at Nabble.com. -- Harsh J www.harshj.com
Re: proxy settings for ivy
Ensure you've set your ANT_OPTS for this, before issuing the ant command. For example: set ANT_OPTS=-Dhttp.proxyHost=kaboom -Dhttp.proxyPort=2888 There are similar options available for authenticated proxies also :) On Mon, Jul 12, 2010 at 9:41 PM, Ahmad Shahzad ashahz...@gmail.com wrote: Hi ALL, Can anyone tell me where I set the proxy settings for ivy? I am unable to build hadoop using ant. It says BUILD FAILED java.net.ConnectException: Connection refused. The reason is that I am connected through a proxy to the internet. So, where should I tell hadoop to use the proxy? Regards, Ahmad Shahzad -- Harsh J www.harshj.com
Re: HOW to COMPILE HADOOP
Use the ant build.xml (and the provided targets) bundled along? On Fri, Jul 2, 2010 at 8:51 PM, Ahmad Shahzad ashahz...@gmail.com wrote: Hi ALL, Can anyone tell me how I would compile the whole hadoop directory if I add some files to the hadoop core directory or change some code in some of the files? Regards, Ahmad Shahzad -- Harsh J www.harshj.com