[ https://issues.apache.org/jira/browse/CHUKWA-369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739887#action_12739887 ]

Ari Rabkin commented on CHUKWA-369:
-----------------------------------

@Jerome. 

I don't believe we can ever have multiple clients writing the same sequence 
file.  There is exactly one writer per HDFS file, and writer.add() is always 
called on behalf of exactly one client.  So add() can return, unambiguously, 
the offset at which that client's data will be written. 

Yes, writes are asynchronous.  But looking at the length of the file tells you 
whether the write committed.  The whole point of what I'm proposing is to 
allow the check for whether the write succeeded to be asynchronous, too.  There 
is no need to keep track of what's in memory.  There are only two states: "data 
committed to the file, and therefore visible as written", and "data not yet 
committed, which we'll re-write if a timeout has expired".
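
To make that concrete, here's a rough sketch of the checker I have in mind. 
This is illustrative only -- CommitChecker, Pending, Resender, and the timeout 
value are placeholders, not code from an actual patch; the only real API used 
is Hadoop's FileSystem.getFileStatus():

{code:java}
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Tracks un-acknowledged chunks.  Everything here is a placeholder sketch.
class CommitChecker {
  static final long TIMEOUT_MS = 5 * 60 * 1000;  // assumed retry window

  static class Pending {
    final long endOffset;  // offset add() reported, plus the chunk's length
    final long sentAt;
    Pending(long endOffset, long sentAt) {
      this.endOffset = endOffset;
      this.sentAt = sentAt;
    }
  }

  interface Resender { void resend(String chunkId); }

  private final Map<String, Pending> pending =
      new ConcurrentHashMap<String, Pending>();

  // Record the offset that add() returned for this chunk.
  void onAdd(String chunkId, long endOffset) {
    pending.put(chunkId, new Pending(endOffset, System.currentTimeMillis()));
  }

  // Run periodically.  The committed file length alone decides each chunk's
  // fate: committed (and so visible as written), or timed out (and so re-sent).
  void check(FileSystem fs, Path seqFile, Resender resender) throws IOException {
    long committedLen = fs.getFileStatus(seqFile).getLen();
    long now = System.currentTimeMillis();
    for (Map.Entry<String, Pending> e : pending.entrySet()) {
      if (e.getValue().endOffset <= committedLen) {
        pending.remove(e.getKey());        // data is on disk; done
      } else if (now - e.getValue().sentAt > TIMEOUT_MS) {
        pending.remove(e.getKey());
        resender.resend(e.getKey());       // not committed in time; re-write
      }
    }
  }
}
{code}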


@Eric:
I don't think a status check once every five minutes, per agent, would be a 
problem.  That's a small fraction of the number of POSTs that we do, and we 
already retransmit very aggressively when the collector returns a 500 error.  
So this shouldn't make things worse than they already are.
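
For concreteness, the agent-side poll could look something like the sketch 
below.  The /commitStatus endpoint and the response handling are assumptions 
on my part, not an existing collector URL:

{code:java}
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class StatusPoller {
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();

  // One small GET per agent every five minutes -- tiny next to our POST volume.
  void start(final String collectorHostPort) {
    scheduler.scheduleAtFixedRate(new Runnable() {
      public void run() {
        try {
          URL url = new URL("http://" + collectorHostPort + "/commitStatus");
          HttpURLConnection conn = (HttpURLConnection) url.openConnection();
          conn.setConnectTimeout(10 * 1000);
          conn.setReadTimeout(10 * 1000);
          if (conn.getResponseCode() == HttpURLConnection.HTTP_OK) {
            // ...parse committed offsets, drop acknowledged chunks...
          }
          conn.disconnect();
        } catch (Exception e) {
          // Collector unreachable: chunks stay pending and time out normally.
        }
      }
    }, 5, 5, TimeUnit.MINUTES);
  }
}
{code}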

The interface you're describing -- where the collector takes responsibility 
for the data as soon as it returns an HTTP response -- is incompatible with 
your option (1), because we're not willing to wait a whole minute before 
returning an HTTP response; we'd run out of request-handling threads in the 
server.  

I don't know what measurements you're citing.  We never implemented option 
(1).  What we've actually implemented is an unreliable version of it, where 
the collector takes responsibility -- *and leaves data in RAM* -- when it 
responds.  The collector, using the code we've written, simply does not flush 
data before responding, so those benchmarks don't really apply.  Likewise, the 
LocalFSWriter doesn't commit data before responding.  The code we actually 
have returns "OK" *before* saving the data to disk. 

@both:

I see there's substantial disagreement here, so I think it makes sense for me 
to implement and test at scale before submitting a patch for review.  If I 
submit measurements, at scale, demonstrating what I have in mind, would that 
be likely to sway you?

> proposed reliability mechanism
> ------------------------------
>
>                 Key: CHUKWA-369
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-369
>             Project: Hadoop Chukwa
>          Issue Type: New Feature
>          Components: data collection
>    Affects Versions: 0.3.0
>            Reporter: Ari Rabkin
>             Fix For: 0.3.0
>
>
> We like to say that Chukwa is a system for reliable log collection. It isn't, 
> quite, since we don't handle collector crashes.  Here's a proposed 
> reliability mechanism.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
