[ 
https://issues.apache.org/jira/browse/FLUME-1227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13609360#comment-13609360
 ] 

Hari Shreedharan commented on FLUME-1227:
-----------------------------------------

{quote}
1) WRT the concern on not depending on another channel: I went down this path 
since it looked like there was some consensus when I started. What alternative 
design do you have in mind?

2) WRT a change in the memory/file channel breaking the Spillable channel: 
Could you expand a bit? I am not familiar with the replay order issue and how 
it can have an impact. I don't think there is any intrinsic assumption being 
made about any specific channel's behavior. Just to be doubly sure, I made sure 
not to rely on a single type of overflow channel in the tests. The only 
material dependency (as far as I can tell) that the Spillable Channel has on 
the overflow is the interface-level guarantee expected from all channels: that 
order is maintained in the case of a single source/sink.
Do you see any other assumptions/dependencies hiding there?
{quote}

I am sorry, I was not part of the initial discussions, so I was not aware of 
the consensus. What I am saying is that depending on another channel creates an 
undesired strong coupling between this channel and the other channels. And if 
there are unit tests in this channel that can break when one of the other 
channels' behavior is changed, that is not acceptable. If you look at all our 
other components, none of them depend on each other (except the RPCClients - 
that is because the sinks are just glorified RPCClients). 

The reason I would not agree with even the single source/sink replay order is 
that our interfaces do not really enforce it. It is not enforced anywhere in 
the documentation either. The FileChannel did not even conform to that single 
source/sink replay order until FLUME-1432. In fact, conforming to that order 
even in FLUME-1432 was a side effect of fixing a race condition, not something 
that was specifically meant to be handled. If at some point it is decided that 
this can change again to some other order (maybe a thread-based ordering, or 
an order in which all events in a transaction get written out together on 
commit, rather than being written out on put and fsynced on commit), and this 
channel's tests break, the onus will be on the contributor who submitted the 
file channel change to fix them - which I do not agree with.

In summary, I am ok with depending on other channels. What I am not ok with is 
depending on the behavior of those channels, which are not explicitly 
guaranteed through interfaces (or even documentation).

bq. 3) WRT reserving capacity on both channels: If you mean that each txn 
should not reserve capacity on both channels, I agree. And the current 
implementation does not do that. Or were you by any chance referring to the 
issue of upfront reservation (at put() time) versus commit() time?

I am talking about put v/s commit time. Transaction capacity is often 
configured to be much higher than the maximum actually expected. I would 
suggest doing a full implementation where there is a transaction outside, and 
a backing store inside. Once the transaction is about to be committed, then 
decide where the events go. (It is going to be tricky to do this and avoid 
doing all the writes at once - the File Channel fsyncs on commit, but writes 
to OS buffers on every write - so it is possible some data is flushed to disk 
before the explicit fsync.) This is not a blocker anyway; we can work on it 
later as well.
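To make the put-vs-commit distinction concrete, here is a minimal sketch of the idea: puts only buffer inside the transaction, and the routing decision (memory vs. overflow) happens at commit time. All names (SpillableTxnSketch, primaryCapacity, the use of String as an event and Deque as a channel stand-in) are illustrative assumptions, not the API in the attached patch.

```java
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Sketch: a transaction that buffers puts and decides where events
// land only when commit() is called, so no capacity is reserved
// upfront at put() time.
public class SpillableTxnSketch {
    private final List<String> putBuffer = new ArrayList<>();
    private final Deque<String> primary;   // stands in for the in-memory queue
    private final Deque<String> overflow;  // stands in for the overflow channel
    private final int primaryCapacity;

    public SpillableTxnSketch(Deque<String> primary, Deque<String> overflow,
                              int primaryCapacity) {
        this.primary = primary;
        this.overflow = overflow;
        this.primaryCapacity = primaryCapacity;
    }

    // put() only buffers the event; nothing is reserved yet.
    public void put(String event) {
        putBuffer.add(event);
    }

    // commit() routes: fill the primary up to its capacity, spill the rest.
    public void commit() {
        for (String e : putBuffer) {
            if (primary.size() < primaryCapacity) {
                primary.addLast(e);
            } else {
                overflow.addLast(e);
            }
        }
        putBuffer.clear();
    }
}
```

The trade-off noted above still applies: deferring everything to commit means all writes happen at once, whereas the File Channel spreads the writes across each put and only pays the fsync at commit.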

bq. 4) WRT testing with fsyncs removed: I have not pursued it since I felt 
that would compromise the durability guarantees. Do you think it's useful to 
do that?

I was wondering whether simply adding a config param to make the fsyncs 
optional (fsync all files before checkpoint in parallel, or something similar) 
would give performance comparable to what is being proposed in this jira. I 
have a feeling it might, since fsyncs are the most expensive part of the file 
channel; with them removed, writes just go to the in-memory OS buffer and the 
fsyncs are taken care of in the background. 
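As a rough sketch of that config idea: every write still lands in the OS page cache immediately, and the expensive FileChannel.force() on commit is gated behind a flag. The class name, the fsyncPerCommit flag, and the log layout are all hypothetical, not an existing Flume option.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Sketch: a log whose commit-time fsync is optional. With the flag
// off, durability is deferred to a later/background sync instead of
// being paid on every commit.
public class OptionalFsyncLog {
    private final FileChannel log;
    private final boolean fsyncPerCommit;

    public OptionalFsyncLog(Path file, boolean fsyncPerCommit) throws IOException {
        this.log = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE);
        this.fsyncPerCommit = fsyncPerCommit;
    }

    public void put(byte[] event) throws IOException {
        // Lands in the OS buffer cache right away; cheap.
        log.write(ByteBuffer.wrap(event));
    }

    public void commit() throws IOException {
        if (fsyncPerCommit) {
            log.force(false); // expensive: blocks until data reaches disk
        }
        // else: the OS (or a periodic checkpoint-time sync) flushes later,
        // trading durability on power loss for throughput.
    }

    public void close() throws IOException {
        log.close();
    }
}
```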

{quote}
5) WRT "we should make the configuration change": Can you elaborate? I am not 
certain which change specifically you are referring to. Or are you referring 
to the whole config approach?
6) WRT lifecycle management and dependencies: After configuration, any channel 
that is found not to be connected to a source/sink is automatically discarded 
from the list of lifecycle-system-managed components. Consequently the 
Spillable Channel becomes the sole lifecycle manager of the overflow channel. 
Otherwise, yes, there would be havoc.
{quote}

I just think we should not allow one component to pull a reference to another 
component in the system. This explicitly breaks the "interact via interfaces" 
idea. We could make sure the spillable channel owns both channels (and manages 
their lifecycle) - to avoid components ending up able to access other 
components owned by the lifecycle manager.
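The ownership idea could look something like this: the outer channel holds its two inner channels as private fields and is their sole lifecycle manager, so they are never registered with (or reachable from) the global supervisor. The Lifecycle interface and class names here are illustrative stand-ins, not Flume's actual LifecycleAware API.

```java
// Sketch: the outer channel privately owns both inner channels and
// delegates start/stop to them; no other component can obtain a
// reference to either inner channel.
interface Lifecycle {
    void start();
    void stop();
    boolean isRunning();
}

public class SpillableOwnerSketch implements Lifecycle {
    // Private: never exposed to, or discoverable by, other components.
    private final Lifecycle memory;
    private final Lifecycle overflow;
    private boolean running;

    public SpillableOwnerSketch(Lifecycle memory, Lifecycle overflow) {
        this.memory = memory;
        this.overflow = overflow;
    }

    @Override public void start() {
        memory.start();
        overflow.start();
        running = true;
    }

    @Override public void stop() {
        // Stop in reverse order of start.
        overflow.stop();
        memory.stop();
        running = false;
    }

    @Override public boolean isRunning() {
        return running;
    }
}
```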


Hope I made myself clearer this time! 
                
> Introduce some sort of SpillableChannel
> ---------------------------------------
>
>                 Key: FLUME-1227
>                 URL: https://issues.apache.org/jira/browse/FLUME-1227
>             Project: Flume
>          Issue Type: New Feature
>          Components: Channel
>            Reporter: Jarek Jarcec Cecho
>            Assignee: Roshan Naik
>         Attachments: 1227.patch.1, SpillableMemory Channel Design.pdf
>
>
> I would like to introduce a new channel that would behave similarly to scribe 
> (https://github.com/facebook/scribe). It would be something between the 
> memory and file channels. Input events would be saved directly to memory 
> (only) and would be served from there. In case the memory fills up, we would 
> spill the events to file.
> Let me describe the use case behind this request. We have plenty of frontend 
> servers that are generating events. We want to send all events to just a 
> limited number of machines from where we would send the data to HDFS (some 
> sort of staging layer). The reason for this second layer is our need to 
> decouple event aggregation and frontend code onto separate machines. Using 
> the memory channel is fully sufficient, as we can survive the loss of some 
> portion of the events. However, in order to sustain maintenance windows or 
> networking issues, we would have to end up with a lot of memory assigned to 
> those "staging" machines. The referenced "scribe" deals with this problem by 
> implementing the following logic: events are saved in memory, similarly to 
> our MemoryChannel. However, in case the memory gets full (because of 
> maintenance, networking issues, ...), it will spill data to disk, where it 
> will sit until everything starts working again.
> I would like to introduce a channel that implements similar logic. Its 
> durability guarantees would be the same as the MemoryChannel's - if someone 
> were to pull the power cord, this channel would lose data. Based on the 
> discussion in FLUME-1201, I would propose to keep the implementation 
> completely independent of any other channel's internal code.
> Jarcec

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
