[ 
https://issues.apache.org/jira/browse/TEZ-2918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14984701#comment-14984701
 ] 

Bikas Saha commented on TEZ-2918:
---------------------------------

I think the sorting, merging and spilling code that uses the Progressable 
object inherits its progress notification behavior from MR, where progress() 
is called after processing a block of records and not on every record. I can 
check that again.
So, e.g., in collect, a write just adds the record to the current buffer. It 
does not trigger any tight-loop operation (sort etc.) until it needs to 
spill. However, if the user code is calling write() in a tight loop then yes, 
this is a tight loop. Same for read in MRInput.
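
To illustrate the "progress() after a block of records" pattern, here is a 
minimal sketch. It is not the Tez or MR code: the Progressable interface below 
is a hypothetical stand-in with a single progress() method, and 
BatchedNotifier, recordProcessed() and the interval of 1000 are made-up names 
and values chosen for the example.

```java
import java.util.concurrent.atomic.AtomicInteger;

class BatchedProgressSketch {
    /** Hypothetical stand-in for a progress callback; only progress() is assumed. */
    interface Progressable {
        void progress();
    }

    /** Forwards one progress() call for every `interval` records seen. */
    static final class BatchedNotifier {
        private final Progressable target;
        private final int interval;
        // Plain int: touched only by the thread that processes records.
        private int sinceLast = 0;

        BatchedNotifier(Progressable target, int interval) {
            if (interval <= 0) {
                throw new IllegalArgumentException("interval must be positive");
            }
            this.target = target;
            this.interval = interval;
        }

        /** Called once per record; the per-record cost is one increment and one compare. */
        void recordProcessed() {
            if (++sinceLast >= interval) {
                sinceLast = 0;
                target.progress();
            }
        }
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        BatchedNotifier notifier = new BatchedNotifier(calls::incrementAndGet, 1000);
        for (int i = 0; i < 10_000; i++) {
            notifier.recordProcessed();
        }
        System.out.println("progress() calls: " + calls.get()); // prints 10
    }
}
```

With this shape the downstream progress() call is off the per-record path, so 
a caller invoking write() in a tight loop pays only for the local counter.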

This is what we were discussing in TEZ-808: how to make the progress call 
cheap enough that it should not matter. However, with cross-thread visibility 
not guaranteed in the Oracle JVM, because it does not guarantee that a busy 
thread will ever be interrupted, we had to use volatile. I took a cue from the 
patch where Gopal changed the generic counters to use an atomic long instead 
of a synchronized block, which removed the perf bottleneck. Each read/write 
increments a counter, so that cost should be on the code path already. Atomic 
vars effectively have a volatile read/write visibility barrier.
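
As a rough sketch of the idea (not the actual patch), the per-record path can 
be limited to an atomic increment plus a volatile-style flag write, with a 
separate reporting thread polling the flag. The class and method names below 
(TaskProgress, onRecord, pollProgress) are hypothetical; only 
java.util.concurrent.atomic from the JDK is used.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

class ProgressVisibilitySketch {
    /** Hypothetical holder shared between a processing thread and a reporter thread. */
    static final class TaskProgress {
        // Lock-free counter in place of a synchronized block.
        private final AtomicLong recordsProcessed = new AtomicLong();
        // set()/getAndSet() have volatile semantics, so the write is visible across threads.
        private final AtomicBoolean progressSinceLastCheck = new AtomicBoolean(false);

        /** Called by the processing thread for every record read or written. */
        void onRecord() {
            recordsProcessed.incrementAndGet();
            progressSinceLastCheck.set(true);
        }

        /** Called by the reporter thread; returns true once per burst of progress. */
        boolean pollProgress() {
            return progressSinceLastCheck.getAndSet(false);
        }

        long records() {
            return recordsProcessed.get();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        TaskProgress progress = new TaskProgress();
        Thread worker = new Thread(() -> {
            for (int i = 0; i < 1_000_000; i++) {
                progress.onRecord();
            }
        });
        worker.start();
        worker.join();
        System.out.println(progress.records());      // prints 1000000
        System.out.println(progress.pollProgress()); // prints true
        System.out.println(progress.pollProgress()); // prints false
    }
}
```

Whether the extra flag write is measurable on top of the counter increment is 
exactly the kind of thing that would need a before/after benchmark.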

In any case, perf can only be measured. While doing all the perf work you 
mention, did we create any perf benchmark code or test that can be used to 
measure this before and after? Could you please point me to ways to measure 
this that have been used earlier?

> Make progress notifications in IOs
> ----------------------------------
>
>                 Key: TEZ-2918
>                 URL: https://issues.apache.org/jira/browse/TEZ-2918
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: Bikas Saha
>            Assignee: Bikas Saha
>         Attachments: TEZ-2918.1.patch, TEZ-2918.2.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
