I expect the impact on I/O speed to be almost zero, because a single disk seek takes longer than many thousands of calls to a synchronized method.
Niels

On Aug 11, 2013 3:00 PM, "Harsh J" <[email protected]> wrote:
> Yes, I feel we could discuss this over a JIRA to remove it if it hurts
> perf. too much, but it would have to be a marked incompatible change,
> and we would have to add a note about the lack of thread safety in the
> javadoc of the base Mapper/Reducer classes.
>
> On Sun, Aug 11, 2013 at 1:26 PM, Sathwik B P <[email protected]> wrote:
> > Hi Harsh,
> >
> > Does it make any sense to keep the method in LRW still synchronized?
> > Isn't it creating unnecessary overhead for non-multi-threaded
> > implementations?
> >
> > regards,
> > sathwik
> >
> > On Fri, Aug 9, 2013 at 7:16 AM, Harsh J <[email protected]> wrote:
> >> I suppose I should have been clearer. There's no problem out of the
> >> box if people stick to the libraries we offer :)
> >>
> >> Yes, the LRW was marked synchronized at some point over 8 years ago
> >> [1] in support of multi-threaded maps, but the framework has changed
> >> much since then. The MultithreadedMapper/etc. API we offer now
> >> automatically shields devs from having to think about output thread
> >> safety [2].
> >>
> >> I can imagine there can only be a problem if a user writes their own
> >> unsafe multi-threaded task. I suppose we could document that in the
> >> Mapper/MapRunner and Reducer APIs.
> >>
> >> [1] - http://svn.apache.org/viewvc?view=revision&revision=171186 -
> >> Commit added a synchronized to the write call.
> >> [2] - MultiThreadedMapper/etc. synchronize over the collector -
> >> http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/map/MultithreadedMapper.java?view=markup
> >>
> >> On Thu, Aug 8, 2013 at 7:52 PM, Azuryy Yu <[email protected]> wrote:
> >> > The sequence file writer is also synchronized; I don't think this
> >> > is bad.
> >> >
> >> > If you call the HDFS API to write concurrently, then it's necessary.
> >> >
> >> > On Aug 8, 2013 7:53 PM, "Jay Vyas" <[email protected]> wrote:
> >> >> Then is this a bug? Synchronization in the absence of any race
> >> >> condition is normally considered "bad".
> >> >>
> >> >> In any case, I'd like to know why this writer is synchronized
> >> >> whereas the other ones are not. That is, I think, the point at
> >> >> issue: either the other writers should be synchronized or else
> >> >> this one shouldn't be - consistency across the write
> >> >> implementations is probably desirable, so that changes to output
> >> >> formats or record writers don't lead to bugs in multi-threaded
> >> >> environments.
> >> >>
> >> >> On Aug 8, 2013, at 6:50 AM, Harsh J <[email protected]> wrote:
> >> >> While we don't fork by default, we do provide a MultithreadedMapper
> >> >> implementation that would require such synchronization. But if you
> >> >> are asking whether it is necessary, then perhaps the answer is no.
> >> >>
> >> >> On Aug 8, 2013 3:43 PM, "Azuryy Yu" <[email protected]> wrote:
> >> >>> It's not Hadoop-forked threads; we may create a line record
> >> >>> writer ourselves and then call this writer concurrently.
> >> >>>
> >> >>> On Aug 8, 2013 4:00 PM, "Sathwik B P" <[email protected]> wrote:
> >> >>>> Hi,
> >> >>>> Thanks for your reply.
> >> >>>> May I know where Hadoop forks multiple threads to use a single
> >> >>>> RecordWriter?
> >> >>>>
> >> >>>> regards,
> >> >>>> sathwik
> >> >>>>
> >> >>>> On Thu, Aug 8, 2013 at 7:06 AM, Azuryy Yu <[email protected]> wrote:
> >> >>>>> Because we may use multiple threads to write a single file.
> >> >>>>>
> >> >>>>> On Aug 8, 2013 2:54 PM, "Sathwik B P" <[email protected]> wrote:
> >> >>>>>> Hi,
> >> >>>>>>
> >> >>>>>> LineRecordWriter.write(..) is synchronized. I did not find
> >> >>>>>> any other RecordWriter implementation that defines write as
> >> >>>>>> synchronized. Is there any specific reason for this?
> >> >>>>>>
> >> >>>>>> regards,
> >> >>>>>> sathwik
> >>
> >> --
> >> Harsh J
>
> --
> Harsh J
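[Editor's note] The pattern Harsh points to in [2] — synchronizing over the one shared collector rather than marking every RecordWriter's write method synchronized — can be sketched roughly as below. This is an illustrative simplification, not Hadoop's actual MultithreadedMapper code; the class and method names here are hypothetical:

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

// Sketch: instead of requiring each RecordWriter to be thread-safe,
// wrap the single shared writer so all mapper threads funnel through
// one lock. Names are illustrative, not Hadoop's API.
class SynchronizedLineWriter {
    private final Writer out;  // underlying writer, assumed NOT thread-safe

    SynchronizedLineWriter(Writer out) {
        this.out = out;
    }

    // One coarse lock per record: a whole key\tvalue\n line is written
    // atomically, so concurrent callers never produce torn lines.
    synchronized void write(String key, String value) throws IOException {
        out.write(key);
        out.write('\t');
        out.write(value);
        out.write('\n');
    }
}

public class Demo {
    public static void main(String[] args) throws Exception {
        StringWriter sink = new StringWriter();
        SynchronizedLineWriter writer = new SynchronizedLineWriter(sink);

        // Simulate several mapper threads sharing one output writer.
        Thread[] threads = new Thread[4];
        for (int t = 0; t < threads.length; t++) {
            final int id = t;
            threads[t] = new Thread(() -> {
                try {
                    for (int i = 0; i < 1000; i++) {
                        writer.write("k" + id, "v" + i);
                    }
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            });
            threads[t].start();
        }
        for (Thread th : threads) {
            th.join();
        }

        // Every record arrived intact: exactly one line per write call.
        System.out.println(sink.toString().lines().count());  // prints 4000
    }
}
```

The design point under discussion in the thread is exactly this: if the lock lives in one wrapper around the shared output, individual RecordWriter implementations would not each need to be synchronized.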

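[Editor's note] Niels's back-of-the-envelope estimate at the top of the thread can be made concrete with rough figures. The ~10 ms rotational seek time and ~25 ns uncontended synchronized-call cost below are order-of-magnitude assumptions for illustration, not measurements:

```java
// Rough arithmetic behind "a disk seek outweighs thousands of
// synchronized calls". Both constants are assumed, illustrative
// order-of-magnitude values; real costs vary by hardware and JVM.
public class SeekVsLock {
    public static void main(String[] args) {
        double seekNanos = 10_000_000;  // ~10 ms rotational disk seek
        double syncCallNanos = 25;      // ~25 ns uncontended monitor enter/exit

        double callsPerSeek = seekNanos / syncCallNanos;
        System.out.printf("~%.0f synchronized calls per disk seek%n",
                callsPerSeek);
    }
}
```

On these assumptions, hundreds of thousands of uncontended synchronized calls fit in the time of a single rotational seek, which is why the lock's overhead is expected to vanish behind disk I/O.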