[jira] [Commented] (YARN-6875) New aggregated log file format for YARN log aggregation.

2019-06-06 Thread Lars Francke (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16858329#comment-16858329
 ] 

Lars Francke commented on YARN-6875:


Is there any user-facing change or configuration for this, and if so, do we 
have any documentation?

> New aggregated log file format for YARN log aggregation.
> 
>
> Key: YARN-6875
> URL: https://issues.apache.org/jira/browse/YARN-6875
> Project: Hadoop YARN
> Issue Type: New Feature
> Reporter: Xuan Gong
> Assignee: Xuan Gong
> Priority: Major
> Attachments: YARN-6875-NewLogAggregationFormat-design-doc.pdf
>
>
> T-file is the underlying log format for the aggregated logs in YARN. We have 
> seen several performance issues, especially for very large log files.
> We will introduce a new log format which has better performance for large 
> log files.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6875) New aggregated log file format for YARN log aggregation.

2019-06-06 Thread Lars Francke (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16858105#comment-16858105
 ] 

Lars Francke commented on YARN-6875:


Thank you for the clarification. In that case, I believe closing this makes 
sense.




[jira] [Commented] (YARN-6875) New aggregated log file format for YARN log aggregation.

2019-06-06 Thread Tan, Wangda (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16858095#comment-16858095
 ] 

Tan, Wangda commented on YARN-6875:
---

[~larsfrancke], this is already usable on appendable file systems such as HDFS. 

I'd prefer to close this ticket and leave the remaining tasks open.




[jira] [Commented] (YARN-6875) New aggregated log file format for YARN log aggregation.

2019-06-06 Thread Adam Antal (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16857493#comment-16857493
 ] 

Adam Antal commented on YARN-6875:
--

As I understand it, the last remaining issue with the new file format is making 
it compatible with non-appendable filesystems, which is addressed in two new 
subtasks: YARN-9525 and YARN-9607. 

I suggest closing this once those two subtasks have been committed.




[jira] [Commented] (YARN-6875) New aggregated log file format for YARN log aggregation.

2019-05-15 Thread Lars Francke (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16840348#comment-16840348
 ] 

Lars Francke commented on YARN-6875:


All subtasks have been resolved here but the issue is still OPEN. Is this 
feature complete and implemented, and are we already using the new format? Can 
we close the issue?




[jira] [Commented] (YARN-6875) New aggregated log file format for YARN log aggregation.

2017-08-08 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16118925#comment-16118925
 ] 

Wangda Tan commented on YARN-6875:
--

[~jlowe], makes sense.




[jira] [Commented] (YARN-6875) New aggregated log file format for YARN log aggregation.

2017-08-08 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16118917#comment-16118917
 ] 

Jason Lowe commented on YARN-6875:
--

As long as the extra index file is only going to be used for the append case 
rather than the create case I'm OK with proceeding with the separate file.  As 
you mentioned, we can switch to a scan method if necessary.  Please add a 
unique marker in the metainfo block, as that allows the reader to sanity-check 
that it has properly located the block at the end of the file and also allows 
for scanning if necessary.




[jira] [Commented] (YARN-6875) New aggregated log file format for YARN log aggregation.

2017-08-08 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16118840#comment-16118840
 ] 

Wangda Tan commented on YARN-6875:
--

[~jlowe], makes sense. It looks like we have to come back to the solution 
proposed by Xuan: 
https://issues.apache.org/jira/browse/YARN-6875?focusedCommentId=16109148&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16109148
I suggest we do the simpler fix as a starting point and improve it later.

Thoughts?




[jira] [Commented] (YARN-6875) New aggregated log file format for YARN log aggregation.

2017-08-07 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16117123#comment-16117123
 ] 

Jason Lowe commented on YARN-6875:
--

I'm not clear on how the local index file would work in practice.  One use case 
for aggregated logs is to run a cluster job that analyzes the log files in 
HDFS.  How would a cluster job handle accessing the local index file for one of 
the log aggregation files?  Is it proxying to the local index file via the 
nodemanager corresponding to the log aggregation file?  If so then that means 
the job now needs sufficient credentials to be able to authenticate to an NM to 
read log aggregation files.

If the NM can fix up failed appends then the main difference between the new 
proposal and one that simply scans when it can't immediately find the metainfo 
is the likelihood the reader will read during an append.  The NM 
crashes/disappears failure case is the same, so if we agree that trying to read 
during a massive append is going to be similarly rare then I'd rather avoid 
having log readers contacting the NM if possible.





[jira] [Commented] (YARN-6875) New aggregated log file format for YARN log aggregation.

2017-08-02 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16111886#comment-16111886
 ] 

Wangda Tan commented on YARN-6875:
--

[~jlowe], [~xgong],
I've been thinking about this issue. We could probably create a local index 
file instead of a remote index file to avoid extra load on the NN.

Do you think the following solution is reasonable?
- The local log aggregator always maintains a separate confirmed index file in 
the *local dir*.
- When we need to do partial log aggregation, we always read the local index 
file, and replace it once partial log aggregation finishes.
- For a file that is being appended to, we will try to load the local index 
file. (I think this is possible.)
- If appending fails and the NM will retry, we follow the same logic as above.
- If appending fails, and the NM is alive but will not retry, it will append 
the index to the remote file.
- If appending fails and the NM is not alive, the reader follows Jason's logic 
and scans for the last index. This should be rare.
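
The cases in the list above can be sketched as a small decision function. This is only an illustration of the proposal as worded in this comment, not code from the patch; the enum, the function name, and the three boolean inputs are hypothetical.

```python
from enum import Enum


class IndexSource(Enum):
    """Hypothetical names for the places a reader could find the index."""
    LOCAL_INDEX_FILE = "confirmed index file in the NM local dir"
    REMOTE_FILE_TAIL = "index appended to the remote log file"
    BACKWARD_SCAN = "scan the remote file for the last index"


def choose_index_source(append_failed: bool, nm_alive: bool,
                        nm_will_retry: bool) -> IndexSource:
    """Pick the index source for a file that is (or was) being appended to."""
    if not append_failed:
        # Append still in progress: the local index is the confirmed one.
        return IndexSource.LOCAL_INDEX_FILE
    if not nm_alive:
        # Nobody can repair the file, so fall back to scanning.
        return IndexSource.BACKWARD_SCAN
    if nm_will_retry:
        # Same logic as the in-progress case.
        return IndexSource.LOCAL_INDEX_FILE
    # NM is alive but gave up: it appends the index to the remote file.
    return IndexSource.REMOTE_FILE_TAIL
```

Only the last branch of the list (NM gone) pays the cost of a scan, which is why the comment expects it to be rare.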

Hope to hear your thoughts.




[jira] [Commented] (YARN-6875) New aggregated log file format for YARN log aggregation.

2017-08-01 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16109148#comment-16109148
 ] 

Xuan Gong commented on YARN-6875:
-

Thanks for the suggestion, [~leftnoteasy].
The approach looks fine, but it would introduce extra complexity if we enable 
compression for log aggregation. Instead of appending UUID + block_id 
for every fixed N bytes, we could append them after every aggregated log, which 
might be easier given that we compress every log type separately.
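
A rough sketch of this variant, where a marker follows each separately compressed log rather than every N bytes. The marker layout (16-byte UUID plus 8-byte big-endian block id), the use of zlib, and the function name are assumptions made for illustration; the actual format is described in the design doc.

```python
import struct
import uuid
import zlib
from typing import Dict, List


def append_logs(buf: bytearray, logs: Dict[str, bytes],
                append_uuid: bytes) -> List[int]:
    """Append each log type compressed on its own, followed by a marker.

    Returns the offset of every marker so a caller can inspect the layout.
    """
    marker_offsets = []
    for block_id, content in enumerate(logs.values()):
        # Each log type is compressed independently, so the marker never
        # lands in the middle of a compressed stream.
        buf += zlib.compress(content)
        marker_offsets.append(len(buf))
        buf += append_uuid + struct.pack(">Q", block_id)
    return marker_offsets


if __name__ == "__main__":
    out = bytearray()
    offsets = append_logs(out, {"stdout": b"a" * 1000, "stderr": b"b" * 10},
                          uuid.uuid4().bytes)
    print(offsets, len(out))
```

Because markers sit on log boundaries, the writer does not need to split a compressed stream at a fixed byte count.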




[jira] [Commented] (YARN-6875) New aggregated log file format for YARN log aggregation.

2017-07-31 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16107821#comment-16107821
 ] 

Wangda Tan commented on YARN-6875:
--

Thanks [~jlowe], 

bq. Quite a few important points to note here:
#1 and #2 are true; however, the original goal of this JIRA was not to be just 
slightly better than the old format.

For #3, it is not true when an append fails.

For example, say we have a file that has been appended to 3 times (partial log 
aggregation ran 3 times). The file looks like:
{code}
|Data-1|Index-1|Data-2|Index-2|Data-3|Index-3|
{code} 

On the 4th append, the write fails midway (due to an NM failure, for example):
{code}
|Data-1|Index-1|Data-2|Index-2|Data-3|Index-3|Data-4...(corrupted)|
{code} 

When we need to read logs, we have to go all the way back to Index-3. 
Depending on how much was written for Data-4, this could be costly.
Worse, if Data-4 is never fixed up for some reason, every future read of the 
app log has to reverse-scan for Index-3 again.

I have another solution in mind, in addition to Jason's earlier suggestion:

When we append logs for each partial log aggregation, we append UUID + 
block_id after every N bytes (N could be 64 MB, for example). The data looks like:
{code}
|Data-block-1-0|UUID_1_0|Data-block-1-1|UUID_1_1|Index-1|Data-block-2-0|UUID_2_0|Index-2|
{code} 

If an append fails for some reason, we search backwards for the last 
UUID+block_id. For example:
{code}
|.good-data|.bad-data.|UUID_x_y|.bad-data.|
{code}

The last UUID+block_id is UUID_x_y, so we know that the corrupted data has y 
more blocks in front of that position and can skip y * (BLOCK_SIZE + UUID_SIZE) 
bytes, which is better than scanning blocks one by one.
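
To make the skip arithmetic concrete, here is a small worked example. It assumes that marker y directly follows data block y and that a marker is a 16-byte UUID plus an 8-byte block id; neither detail is spelled out in the comment, and the names are hypothetical.

```python
BLOCK_SIZE = 64 * 1024 * 1024   # N from the example above (64 MB)
MARKER_SIZE = 16 + 8            # assumed: 16-byte UUID + 8-byte block_id


def failed_append_start(marker_pos: int, block_id: int) -> int:
    """Offset where the failed append began, given the last marker found.

    marker_pos is the offset of the first byte of UUID_x_y and block_id is y.
    Block y itself plus y earlier (block, marker) pairs precede the marker.
    """
    return marker_pos - BLOCK_SIZE - block_id * (BLOCK_SIZE + MARKER_SIZE)


if __name__ == "__main__":
    good_end = 1000                                     # end of the good data
    marker_1 = good_end + 2 * BLOCK_SIZE + MARKER_SIZE  # position of UUID_x_1
    print(failed_append_start(marker_1, 1))             # same as good_end
```

One seek to the computed offset replaces a block-by-block backward scan, at the cost of having to find one marker first.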

Thoughts? [~xgong].




[jira] [Commented] (YARN-6875) New aggregated log file format for YARN log aggregation.

2017-07-31 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16107375#comment-16107375
 ] 

Jason Lowe commented on YARN-6875:
--

To be clear, I'm not a fan of the current approach for partial aggregations 
that generates a separate file per pass.  I think we're all in agreement that 
partial aggregations should not result in multiple files after the operation 
completes.  I'm just proposing a way to avoid any additional files, even 
transient, during partial aggregation.  We already need some kind of marker for 
the metainfo block so the reader can know with certainty it has found a proper 
metainfo block, otherwise the race condition I pointed out above will result in 
undefined behavior for the reader.  I'm proposing we leverage this marker so we 
can avoid the need for a transient index file.

bq. However, if we don't write the (temp) index file, and the approach listed 
in Jason's comment will make read become very slow since it need to repeatedly 
find where's the last successful write. And the worst part is, we only need to 
read logs when app fails or slow, it will be likely that we will read such app 
logs for a couple of times. I don't think it will be a good user experience to 
do this every-time.

Quite a few important points to note here:
# The read scan won't be as slow as it is today.  Today it has to decompress 
each block in order to locate the next block.  The scan for the metainfo marker 
would not require any decompression, just a straight read.
# The read scan today must start from the beginning of the file, so it has to 
read (and decompress!) the worst-case amount of data to find logs at the end.  
For the metainfo scan we only need to scan from the end of the file to the 
first metainfo block we find.  That means, worst-case, we're only going to read 
(without decompressing) the amount of data for the last append operation 
currently in progress to locate any log in the file.
# The read scan only needs to occur when we are trying to read during an append 
operation.  This will only be a repeating process if the append operation is 
still ongoing when we try to do subsequent reads.

I would argue this scan is going to be much faster than you are assuming, and 
we only need to perform it when there is an ongoing append.  What is the 
anticipated duty cycle of append operations?  How likely will the repeated read 
scan scenario occur in practice, and to a point where the scan is not fast 
enough?

bq. what's the percentage of apps running in your cluster which enabled partial 
log aggregation?

We currently do not have any partial aggregations enabled in our clusters.  The 
number of additional files it creates today is one of the obstacles to 
enabling it, but as we see longer and longer running apps on our clusters we 
will eventually need a partial aggregation solution.  Hopefully we're in 
agreement that no transient index file should be created during a normal log 
aggregation, and we're only debating what to do for partial aggregations.




[jira] [Commented] (YARN-6875) New aggregated log file format for YARN log aggregation.

2017-07-28 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105871#comment-16105871
 ] 

Wangda Tan commented on YARN-6875:
--

Thanks for the comments, [~jlowe]/[~xgong]. 

I think I misled Jason earlier: we didn't plan to add the separate index design 
at the beginning, but we found that it is required for recovery. 

I agree with Jason's points:
- Log files are rarely read after being written.
- Creating a separate index file during writes means 2x the workload on the 
Namenode. 

However, if we don't write the (temp) index file, the approach listed in 
Jason's comment will make reads very slow, since the reader needs to repeatedly 
find the last successful write. And the worst part is that we only need to 
read logs when an app fails or is slow, so it is likely that we will read such 
app logs several times. I don't think doing this every time will be a good 
user experience. 

I agree with Xuan's comments: if partial log aggregation is not enabled, 
this design doesn't add any workload. [~jlowe], what percentage of 
apps running in your cluster have partial log aggregation enabled? 

For the partial log aggregation case, an alternative solution is to write 
log+index to a separate file every time, which makes write performance exactly 
the same as TFile, but read performance can be much better. Jason, could you 
share your thoughts here?





[jira] [Commented] (YARN-6875) New aggregated log file format for YARN log aggregation.

2017-07-28 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105859#comment-16105859
 ] 

Xuan Gong commented on YARN-6875:
-

Thanks for the comments, [~jlowe]. I fully understand your concern. But,

bq.  I'm not a big fan of having a separate file, even temporarily, because log 
aggregation can already be a large portion of the namenode's write load on 
large clusters. Having that separate file will increase the namenode write load 
significantly (approximately 2x per log aggregation cycle if I understand it 
correctly).

I agree with this, but the proposed solution will not be worse than the current 
solution (TFile). Also, the index file will be created only when partial 
log aggregation is enabled.
If we enable partial log aggregation:
* For the T-File solution (currently used), we create a new file every time 
we do log aggregation. If we have done log aggregation three times, we 
have three T-Files.
* For the proposed solution, we have at most two files: the log file and the 
index file.

bq. Note that the separate index file doesn't solve all the race conditions for 
the reader.

Yes, this corner case is valid, but I think this is OK. The reader would 
fail in this case, but we can always retry the read later.




[jira] [Commented] (YARN-6875) New aggregated log file format for YARN log aggregation.

2017-07-27 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16103982#comment-16103982
 ] 

Robert Kanter commented on YARN-6875:
-

[~xgong], have you taken a look at YARN-2942 and subtasks?  I tried to do 
something like this a while ago and we went through a few different designs (I 
think there are 3 major different approaches, and some minor revisions for 
each); one of the approaches was very similar to your design, where there's an 
index file.

In the end, we decided to do something completely different (MAPREDUCE-6415) by 
adding a command to combine log files into HAR files.  This was to help with 
the too-many-small-files problem; though we still kept the T-files, so the goal 
was slightly different.  

Anyway, I did write a bunch of code for YARN-2942 and some subtasks before we 
canned it, so you might want to take a look in case you find something useful 
in there or the design documents.




[jira] [Commented] (YARN-6875) New aggregated log file format for YARN log aggregation.

2017-07-27 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16103346#comment-16103346
 ] 

Jason Lowe commented on YARN-6875:
--

Thanks for posting the doc!  I'm not a big fan of having a separate file, even 
temporarily, because log aggregation can already be a large portion of the 
namenode's write load on large clusters.  Having that separate file will 
increase the namenode write load significantly (approximately 2x per log 
aggregation cycle if I understand it correctly).

Note that the separate index file doesn't solve all the race conditions for the 
reader.  For example, this sequence:
# Reader checks for an index file which is not there
# Writer begins append and creates index file and starts appending
# Reader seeks to the end of the log file but does _not_ find the metainfo 
structure because the writer is in the process of appending more data

This could be mitigated by having the reader repeat the attempt to read process 
from the beginning so it can rediscover the index file, but this requires that 
the reader is capable of recognizing that it is _not_ looking at a proper 
metainfo block on that first attempt.  The document does not cover this 
necessary rinse-repeat cycle required on the reader's part, nor how a reader 
can reliably identify the case where it is not looking at a proper metainfo 
block because it happened to try to read just as an append operation occurs.
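
The rinse-repeat cycle on the reader side could look roughly like the sketch below. The three callables stand in for filesystem operations and are hypothetical, as is the bounded retry count; the key assumption is that `read_tail` can tell a proper metainfo block from in-flight append data and returns None otherwise.

```python
from typing import Callable, Optional


def locate_metainfo(index_exists: Callable[[], bool],
                    read_index: Callable[[], int],
                    read_tail: Callable[[], Optional[int]],
                    max_attempts: int = 3) -> int:
    """Return the metainfo offset, restarting from the top on a bad tail."""
    for _ in range(max_attempts):
        if index_exists():
            # A writer is appending; the index file has the last good offset.
            return read_index()
        offset = read_tail()
        if offset is not None:
            return offset
        # The tail was not a proper metainfo block, so a writer most likely
        # started appending after the index check. Start over.
    raise IOError("could not locate a metainfo block")


if __name__ == "__main__":
    # Simulate the race: the index file appears after the first check.
    checks = []
    def index_exists() -> bool:
        checks.append(1)
        return len(checks) > 1
    print(locate_metainfo(index_exists, lambda: 42, lambda: None))
```

The loop only terminates correctly if `read_tail` never mistakes append data for a metainfo block, which is the gap in the document that this comment points out.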

I'm wondering if we can eliminate the need for the index file, and thus reduce 
the write load on the namenode, by having the reader be able to discover the 
metainfo file even during an append operation.  Similar to sync markers in 
SequenceFile, we could create a unique, UUID-like sync marker that is written 
out before every metainfo block.  The reader would attempt to find the metainfo 
block normally (i.e.: seek to the last 64 bits of the file, read the 64-bit 
offset, then seek back that far to check for a metainfo block).  If it finds it 
then great, the reader is ready to read whatever it is looking for.  If it does 
not find a proper metainfo file then it can start scanning backwards through 
the file looking for a metainfo sync marker.  This scan could be accomplished 
via a number of ways, such as sequentially scanning backwards block at a time 
in fixed-size blocks or seeking much farther backwards in a larger chunk that 
is scanned forward in fixed-sized chunks then repeating if the marker is not 
found.
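
A minimal sketch of the lookup just described, using the simpler of the two scan strategies (backwards in fixed-size chunks). The marker value, the trailer layout (an 8-byte big-endian distance from the end of the file back to the marker), and the function names are assumptions for illustration and are not taken from the design doc. The sketch also ignores the small chance that log data contains the marker bytes.

```python
import io
import struct
import uuid

# Hypothetical 16-byte sync marker written before every metainfo block.
SYNC_MARKER = uuid.UUID("0123456789abcdef0123456789abcdef").bytes
TRAILER = struct.Struct(">Q")  # assumed: 64-bit distance back to the marker


def append_segment(buf: bytearray, data: bytes, metainfo: bytes) -> None:
    """Append log data, then marker + metainfo, then the trailing offset."""
    buf += data
    block = SYNC_MARKER + metainfo
    buf += block
    buf += TRAILER.pack(len(block) + TRAILER.size)


def find_metainfo(f, chunk_size: int = 4096) -> int:
    """Return the offset of the last sync marker in f, or -1 if none."""
    f.seek(0, io.SEEK_END)
    size = f.tell()
    # Fast path: follow the trailing offset and verify the marker.
    if size >= TRAILER.size:
        f.seek(size - TRAILER.size)
        (back,) = TRAILER.unpack(f.read(TRAILER.size))
        pos = size - back
        if 0 <= pos <= size - len(SYNC_MARKER):
            f.seek(pos)
            if f.read(len(SYNC_MARKER)) == SYNC_MARKER:
                return pos
    # Slow path: scan backwards chunk by chunk. Each window is extended by
    # len(marker) - 1 bytes so a marker straddling two chunks is not missed.
    overlap = len(SYNC_MARKER) - 1
    end = size
    while end > 0:
        start = max(0, end - chunk_size)
        f.seek(start)
        window = f.read(min(size, end + overlap) - start)
        hit = window.rfind(SYNC_MARKER)
        if hit != -1:
            return start + hit
        end = start
    return -1


if __name__ == "__main__":
    log = bytearray()
    append_segment(log, b"a" * 100, b"meta-1")
    log += b"x" * 10000  # an append in progress, no metainfo yet
    print(find_metainfo(io.BytesIO(bytes(log))))  # offset of the first marker
```

The slow path reads raw bytes only, with no decompression, and stops at the first marker it meets from the end, so its cost is bounded by the size of the append in progress.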

Isn't this a lot slower for the reader when it has to scan for the marker?  
Yep, it sure is.  However I would argue this is probably a rare occurrence in 
practice for two reasons:
# Logs are often written and never read
# Appending is a relatively rare and short-lived operation during the lifespan 
of a log file

By having the writer create the index file, we're essentially optimizing for 
this rare read-during-append case at the expense of making every writer more 
expensive.  Instead the sync marker approach optimizes for the much more common 
writing case, putting the load on the reader side if it happens to encounter a 
log file mid-append during a read operation.  I would argue that should be a 
relatively rare occurrence, and thus I'd rather optimize for the more common 
case.

Another alternative to the index file is using xattrs to associate the last 
good metainfo offset with the file.  However that still leads to approximately 
the same namenode write ops as the separate index file and requires special 
support on the underlying filesystem.  I'm not a fan of using xattrs myself, 
but I thought I'd mention it in the interest of covering the potential 
solutions.





[jira] [Commented] (YARN-6875) New aggregated log file format for YARN log aggregation.

2017-07-26 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16102302#comment-16102302
 ] 

Xuan Gong commented on YARN-6875:
-

Uploaded a design doc. Please review.
