[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2016-07-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15369770#comment-15369770
 ] 

Hudson commented on YARN-3908:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #10074 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10074/])
YARN-3908. Fixed bugs in HBaseTimelineWriterImpl. Contributed by (sjlee: rev 
a9fab9b644e636c1f1b2632130d4eaea70111f16)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/common/Separator.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/TestHBaseTimelineWriterImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/entity/EntityColumnPrefix.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/common/ColumnPrefix.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/HBaseTimelineWriterImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/timelineservice/TimelineEvent.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/entity/EntityTable.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/common/ColumnHelper.java


> Bugs in HBaseTimelineWriterImpl
> ---
>
> Key: YARN-3908
> URL: https://issues.apache.org/jira/browse/YARN-3908
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Zhijie Shen
>Assignee: Vrushali C
> Fix For: YARN-2928
>
> Attachments: YARN-3908-YARN-2928.001.patch, 
> YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch, 
> YARN-3908-YARN-2928.004.patch, YARN-3908-YARN-2928.004.patch, 
> YARN-3908-YARN-2928.005.patch
>
>
> 1. In HBaseTimelineWriterImpl, the info column family contains the basic 
> fields of a timeline entity plus events. However, entity#info map is not 
> stored at all.
> 2. event#timestamp is also not persisted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-27 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14642860#comment-14642860
 ] 

Sangjin Lee commented on YARN-3908:
---

Thanks for the update [~vrushalic]. Are folks OK with this going in as is and 
making further changes to the event schema in a separate ticket? Let me know, 
and if everyone is fine, I'll merge this patch later today.



[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-27 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14643048#comment-14643048
 ] 

Li Lu commented on YARN-3908:
-

Hi [~sjlee0], I'm OK with checking this in and addressing the event schema 
problem in a separate JIRA. 



[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-27 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14643113#comment-14643113
 ] 

Zhijie Shen commented on YARN-3908:
---

[~vrushalic], thanks for fixing the problem. W.R.T the column key, shall we use:
{code}
e!eventId?eventTimestamp?eventInfoKey : eventInfoValue 
{code}

Imagine we have two KILL events: one at TS1 and the other at TS2. IMHO, we want 
to scan through the two events' columns one by one instead of in an interleaved 
manner. This will make it much easier for the reader to parse multiple events 
and encapsulate them one after the other. It will also be more useful in the 
future if we want to retrieve just part of the events of a big job (e.g. within 
a given time window, or the most recent events). Thoughts?
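
To make the interleaving concern concrete, here is a minimal sketch that compares the scan order of the two column key layouts for two KILL events. It assumes plain strings, zero-padded decimal timestamps, and lexicographic ordering; the class and method names are hypothetical and are not taken from the patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

final class EventKeyLayouts {
  // Hypothetical encoding: zero-padded decimal so that string order equals numeric order.
  static String pad(long ts) {
    return String.format("%019d", ts);
  }

  // Layout e!eventId?eventInfoKey?eventTimestamp
  static String infoKeyFirst(String eventId, String infoKey, long ts) {
    return "e!" + eventId + "?" + infoKey + "?" + pad(ts);
  }

  // Layout e!eventId?eventTimestamp?eventInfoKey
  static String timestampBeforeInfoKey(String eventId, String infoKey, long ts) {
    return "e!" + eventId + "?" + pad(ts) + "?" + infoKey;
  }

  // Returns the keys in the lexicographic order a sorted column store would scan them.
  static List<String> scanOrder(List<String> keys) {
    return new ArrayList<>(new TreeSet<>(keys));
  }

  public static void main(String[] args) {
    List<String> infoFirst = new ArrayList<>();
    List<String> tsFirst = new ArrayList<>();
    for (long ts : new long[] {1L, 2L}) {
      for (String infoKey : new String[] {"reason", "user"}) {
        infoFirst.add(infoKeyFirst("KILL", infoKey, ts));
        tsFirst.add(timestampBeforeInfoKey("KILL", infoKey, ts));
      }
    }
    // Info key first: the two events' columns alternate per info key.
    scanOrder(infoFirst).forEach(System.out::println);
    // Timestamp before info key: each event's columns are contiguous.
    scanOrder(tsFirst).forEach(System.out::println);
  }
}
```

With the info key before the timestamp, the cells of the two events alternate for every info key; with the timestamp before the info key, all cells of the TS1 event come before all cells of the TS2 event.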



[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-27 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14643188#comment-14643188
 ] 

Vrushali C commented on YARN-3908:
--

Hi Zhijie

Thanks, that is a good point. But if we put the event timestamp first, we have 
no way of querying for a particular event key unless we know the exact 
timestamp, and I think knowing the exact time is almost impossible. 

Imagine that there is another event that occurs between the two kill events, so 
it has a timestamp > kill1 and < kill2. Now we still have to fetch all of those 
and filter them out, so placing the timestamp first does not help in this case. 
But if we have the event key first, the columns will be placed together and the 
event timestamps will be stored in reverse chronological order (using the 
Long.MAX_VALUE - ts value). So the first one fetched for the kill event would 
be the latest for that event key. 
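
As a rough illustration of the inverted-timestamp idea, the sketch below stores Long.MAX_VALUE - ts in the key so that the newest cell sorts first. It is only a sketch under assumptions: string keys with zero-padded decimals, non-negative timestamps, and a TreeMap standing in for the sorted columns of a row. The class and method names are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

final class InvertedTimestampKeys {
  // Larger (newer) timestamps map to smaller values, so they sort first.
  static long invert(long ts) {
    return Long.MAX_VALUE - ts;
  }

  // Layout e!eventId?eventInfoKey?invertedTimestamp (assumed encoding).
  static String columnKey(String eventId, String infoKey, long ts) {
    return String.format("e!%s?%s?%019d", eventId, infoKey, invert(ts));
  }

  // Returns the value of the most recent cell for the event id and info key, or null.
  static String latest(TreeMap<String, String> columns, String eventId, String infoKey) {
    String prefix = "e!" + eventId + "?" + infoKey + "?";
    Map.Entry<String, String> first = columns.ceilingEntry(prefix);
    return (first != null && first.getKey().startsWith(prefix)) ? first.getValue() : null;
  }

  public static void main(String[] args) {
    TreeMap<String, String> columns = new TreeMap<>();
    columns.put(columnKey("KILL", "reason", 1000L), "first kill");
    columns.put(columnKey("KILL", "reason", 2000L), "second kill");
    // The first cell under the prefix is the one written with the larger timestamp.
    System.out.println(latest(columns, "KILL", "reason"));
  }
}
```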

thanks
Vrushali






[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-27 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14643267#comment-14643267
 ] 

Zhijie Shen commented on YARN-3908:
---

Okay, it's a fair point. It seems that the key design depends significantly on 
how we want to operate on the events. The current key design is most friendly 
to checking whether there exist events that match a given event ID and some 
given info key (and its value). But if you want to fetch everything that 
belongs to an event (our query needs to do this, as the event is implicitly an 
atomic unit for now), it seems inevitable that we scan through all the columns 
that have the given event ID (correct me if I'm wrong :-). If so, there seems to 
be little gain from this key design, while it complicates the event 
encapsulation logic.

And after rethinking the current queries we need to support (YARN-3051), I want 
to amend my suggestion. It seems more reasonable to use 
{{e!eventTimestamp?eventId?eventInfoKey}}, such that we can natively scan 
through the events of one entity one by one and return them in chronological 
order.
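
A minimal sketch of what such a scan could look like on the reader side, assuming string keys, zero-padded decimal timestamps, and ids and info keys that do not contain the '?' separator (real code would need escaping). The Event class and the helper names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class ChronologicalEventScan {
  // Hypothetical in-memory event: identity is (timestamp, id), plus an info map.
  static final class Event {
    final long timestamp;
    final String id;
    final Map<String, String> info = new LinkedHashMap<>();

    Event(long timestamp, String id) {
      this.timestamp = timestamp;
      this.id = id;
    }
  }

  // Layout e!eventTimestamp?eventId?eventInfoKey (assumed encoding).
  static String columnKey(long ts, String id, String infoKey) {
    return String.format("e!%019d?%s?%s", ts, id, infoKey);
  }

  // Walks the sorted columns once; cells of one event are contiguous in this layout.
  static List<Event> decode(TreeMap<String, String> columns) {
    List<Event> events = new ArrayList<>();
    Event current = null;
    for (Map.Entry<String, String> cell : columns.entrySet()) {
      // Drop the "e!" prefix, then split into [timestamp, id, infoKey].
      String[] parts = cell.getKey().substring(2).split("\\?", 3);
      long ts = Long.parseLong(parts[0]);
      if (current == null || current.timestamp != ts || !current.id.equals(parts[1])) {
        current = new Event(ts, parts[1]);
        events.add(current);
      }
      current.info.put(parts[2], cell.getValue());
    }
    return events;
  }

  public static void main(String[] args) {
    TreeMap<String, String> columns = new TreeMap<>();
    columns.put(columnKey(2000L, "KILL", "reason"), "preempted");
    columns.put(columnKey(1000L, "START", "user"), "alice");
    columns.put(columnKey(1000L, "START", "queue"), "default");
    for (Event e : decode(columns)) {
      System.out.println(e.timestamp + " " + e.id + " " + e.info);
    }
  }
}
```

Because every cell of one event shares the prefix e!timestamp?id?, a single forward pass is enough to rebuild the events, and they come out ordered by timestamp.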



[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-27 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14643446#comment-14643446
 ] 

Vrushali C commented on YARN-3908:
--

Yes, let's get the current patch in and continue the discussion on the event 
schema.



[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-27 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14643421#comment-14643421
 ] 

Li Lu commented on YARN-3908:
-

Just a quick check on the current status of this JIRA. Are we still planning 
to merge it in ASAP, do we want to fix the row key of timeline events with one 
more draft, or do we plan to fully resolve the timeline event problems before 
we merge it in (if fixing the row key does not fully resolve the problem)? I'd 
like to know our plan for this JIRA so that I can fine-tune my patch for 
YARN-3904 accordingly. Thanks! 



[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-27 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14643476#comment-14643476
 ] 

Zhijie Shen commented on YARN-3908:
---

Sure, as most folks are comfortable with the latest patch, let's get this in. 
I'll file a separate jira to track the discussion about event column key.



[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-27 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14643424#comment-14643424
 ] 

Sangjin Lee commented on YARN-3908:
---

+1 for committing this patch and having the event schema discussion in a 
different JIRA.



[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-24 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14641026#comment-14641026
 ] 

Vrushali C commented on YARN-3908:
--

Thanks everyone for the discussion. I will upload another patch on this soon. 
Sangjin, Joep and I also had some more offline discussions on this over the 
last few days. We considered two options:
1. store the event timestamp as the hbase cell timestamp.
2. store the event timestamp as part of the column key.

With the first approach, time range queries are easier, for example "give me 
the events that occurred in this time range". The column names look cleaner 
too. The downside of the first approach is that we need to set up the column 
family to keep multiple versions and ensure that columns other than the event 
columns don't store multiple versions, which is not a very clean way to store 
it. Yet another option is to store event information in the metrics column 
family, but it does not actually belong there, so we would be mixing things, 
which would make aggregating metrics harder.  

So based on these points, we plan to go with approach #2: storing the event 
timestamp as part of the column key. I will be making some changes to this 
patch accordingly. The event information will be stored in the info column 
family, and the timestamps will be part of the column name. So it will be 
stored as: 
{code} e!eventId?eventInfoKey?eventTimestamp : eventInfoValue {code}

For the reader:
There is a {code} org.apache.hadoop.hbase.filter.ColumnPrefixFilter {code} 
which can be used to scan specific column keys. With respect to chronological 
ordering, there needs to be some filtering in the reader code to pick the event 
info key-values that belong to the latest timestamp. 
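
The reader-side filtering could look roughly like the sketch below. It does not use HBase: a TreeMap range query stands in for a column prefix scan, keys are assumed to be plain strings with zero-padded decimal timestamps, and the class and method names are hypothetical.

```java
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

final class LatestEventInfoReader {
  // Layout e!eventId?eventInfoKey?eventTimestamp (assumed encoding).
  static String columnKey(String eventId, String infoKey, long ts) {
    return String.format("e!%s?%s?%019d", eventId, infoKey, ts);
  }

  // Stand-in for a column prefix scan: all cells whose key starts with the prefix.
  static SortedMap<String, String> prefixScan(TreeMap<String, String> columns, String prefix) {
    return columns.subMap(prefix, prefix + Character.MAX_VALUE);
  }

  // Keeps only the info key-values written with the largest timestamp for this event id.
  static Map<String, String> latestOccurrence(TreeMap<String, String> columns, String eventId) {
    long latestTs = -1L;
    Map<String, String> info = new TreeMap<>();
    for (Map.Entry<String, String> cell : prefixScan(columns, "e!" + eventId + "?").entrySet()) {
      // Split into [e!eventId, infoKey, timestamp].
      String[] parts = cell.getKey().split("\\?", 3);
      long ts = Long.parseLong(parts[2]);
      if (ts > latestTs) {
        latestTs = ts;
        info.clear(); // everything collected so far belongs to an older occurrence
      }
      if (ts == latestTs) {
        info.put(parts[1], cell.getValue());
      }
    }
    return info;
  }

  public static void main(String[] args) {
    TreeMap<String, String> columns = new TreeMap<>();
    columns.put(columnKey("KILL", "k1", 1L), "v1");
    columns.put(columnKey("KILL", "k2", 1L), "v2");
    columns.put(columnKey("KILL", "k1", 2L), "v1b");
    System.out.println(latestOccurrence(columns, "KILL"));
  }
}
```

Since the timestamp is the last component of the key, the cells of one occurrence are not contiguous, which is why the reader has to track the largest timestamp while it scans.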

For example, in the example given by [~zjshen] above:
bq.  I think proper logic is: if we put event1, ts1 and event1, ts2, we 
should have two separate records persisted; and if we put event1, ts1, info: 
[k1=v1, k2=v2] and event1, ts1, info: [k1=v1'] again, we should update the 
same record and let k1=v1'.
Yes, this will be stored as you describe. But for reading, we will get back 
all values that belong to all event timestamps, since they will be part of the 
column key, so the reader needs to know which ones to return.





[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-24 Thread Joep Rottinghuis (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14641114#comment-14641114
 ] 

Joep Rottinghuis commented on YARN-3908:


It would be good to get the patch for YARN-3908 checked in (and then continue 
further discussion if needed).
Otherwise we'll end up having more and more patches in flight that all end up 
modifying the same code and that will require merges.



[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-24 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14641211#comment-14641211
 ] 

Vrushali C commented on YARN-3908:
--

Sounds good, the current patch can be checked in and I can continue working on 
it later. 



[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-24 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14641136#comment-14641136
 ] 

Li Lu commented on YARN-3908:
-

bq. It would be good to get the patch for YARN-3908 checked in (and then 
continue further discussion if needed).
I agree. After checking this in we can rebase all ongoing patches for the 
storage implementations. Specifically, I'd like to wait for this patch to go in 
before I upload a patch in YARN-3904, which handles both online and offline 
storage implementations. 



[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-22 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637300#comment-14637300
 ] 

Zhijie Shen commented on YARN-3908:
---

bq. Is it the event id + timestamp? How about the event type? If you look at 
the equals() and the hashCode() implementations of TimelineEvent, it uses the 
timestamp, the event type, and even the info as a whole, but the id is not used 
for equality. How does that square with the stated intent that the event id and 
the timestamp form the identity?

There's no event type now. In v1 it's called type, but in v2 it is renamed to 
id. We want to use id + ts to identify an event object uniquely, to support the 
case where an event happens multiple times. That way we can avoid combined IDs 
like container_allocation_13421543243. Does this make sense?

bq. Is pretty much the only access pattern give me all the events that belong 
to this entity?

Yeah: getting the events of one entity in chronological order, or getting just 
part of them via filtering.

bq. Two TimelineEvents are equal only if the timestamp is equal AND the type is 
equal AND the entire info maps are equal. What would we query by event type, 
timestamp and event info key? Do users always have to specify the timestamp?

There's no type, only ID. In the current reader API we cannot do sub-entity 
filtering, but in the future we can try to support, for example, getting the 
events in a given time window. If two events have the same id and ts but 
different info, we may consider them the same event carrying different 
information. The later put will append more k/v pairs or update the existing 
ones.

bq. Do we need to store only the latest event for each timestamp, or all of 
them? It would almost sound like the key should be type and timestamp, but what 
about the entire event info map?

In the DB, I think the proper logic is: if we put event1, ts1 and event1, ts2, 
we should have two separate records persisted; and if we put event1, ts1, info: 
\[k1=v1, k2=v2\] and then event1, ts1, info: \[k1=v1'\], we should update the 
same record and set k1=v1'.
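
The put semantics described above can be sketched with an in-memory map keyed by (event id, timestamp). This is only an illustration of the intended behavior, not the HBase writer; the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

final class EventUpsertSketch {
  // One record per (event id, timestamp); the value is that event's info map.
  private final Map<String, Map<String, String>> records = new LinkedHashMap<>();

  // Creates the record if it is new, otherwise merges the info into the existing one.
  void put(String eventId, long timestamp, Map<String, String> info) {
    records.computeIfAbsent(eventId + "@" + timestamp, k -> new HashMap<>()).putAll(info);
  }

  int recordCount() {
    return records.size();
  }

  Map<String, String> info(String eventId, long timestamp) {
    return records.get(eventId + "@" + timestamp);
  }

  public static void main(String[] args) {
    EventUpsertSketch store = new EventUpsertSketch();
    Map<String, String> first = new HashMap<>();
    first.put("k1", "v1");
    first.put("k2", "v2");
    store.put("event1", 1L, first);          // event1, ts1, info: [k1=v1, k2=v2]
    store.put("event1", 2L, new HashMap<>()); // event1, ts2: a separate record
    Map<String, String> update = new HashMap<>();
    update.put("k1", "v1'");
    store.put("event1", 1L, update);         // same record as the first put, k1 is updated
    System.out.println(store.recordCount());
    System.out.println(store.info("event1", 1L));
  }
}
```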





[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-22 Thread Joep Rottinghuis (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636399#comment-14636399
 ] 

Joep Rottinghuis commented on YARN-3908:


What does this mean for the HBaseTimelineWriterImpl?

Two TimelineEvents are equal only if the timestamp is equal AND the type is 
equal AND the entire info maps are equal.
What would we query by event type, timestamp and event info key? Do users 
always have to specify the timestamp?
Do we need to store only the latest event for each timestamp, or all of them?
The answers to all these questions are critical for the key design for the 
TimelineEvent.
It would almost sound like the key should be type and timestamp, but what about 
the entire event info map?



[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-21 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14635754#comment-14635754
 ] 

Sangjin Lee commented on YARN-3908:
---

I filed YARN-3949 to address the need for timely flush of writes.



[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-21 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14635996#comment-14635996
 ] 

Sangjin Lee commented on YARN-3908:
---

[~jrottinghuis], [~vrushalic], and I had offline chats, and we feel that we may 
need to revisit how we store events.

Currently (with this patch) we store the event with the column name 
e!eventId?infoKey and the column value being the info value. The event 
timestamp is stored as the cell timestamp. We're realizing that this may not be 
a correct way to store events.

I'm basing this on the 
[discussion|https://issues.apache.org/jira/browse/YARN-3836?focusedCommentId=14619729&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14619729]
 we had when we talked about the equality and identity semantics of 
{{TimelineEvent}}. Namely, the id *and* the timestamp form the identity of a 
{{TimelineEvent}}. Then I think storing the timestamp in the HBase cell 
timestamp does not work.

Some questions for you, [~zjshen] and [~gtCarrera9].

(1) *What defines the identity of a {{TimelineEvent}}?*
Is it the event id + timestamp? How about the event type? If you look at the 
{{equals()}} and the {{hashCode()}} implementations of {{TimelineEvent}}, it 
uses the timestamp, the event type, and even the info as a whole, but the id is 
not used for equality. How does that square with the stated intent that the 
event id and the timestamp form the identity?

(2) *What would be the access pattern for {{TimelineEvents}}?*
Is pretty much the only access pattern "give me all the events that belong to 
this entity"?

Also specifically, would you ever query for an event with the id *and* the 
timestamp? It is not reasonable to expect readers to provide the event 
timestamp for queries, right?

Would you also query for just the event id? What other access patterns need to 
be supported?

Clarifying those things would help us correctly implement the schema. Thanks!



[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-21 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636009#comment-14636009
 ] 

Li Lu commented on YARN-3908:
-

Hi [~sjlee0], I don't think we're still using event type in new TimelineEvent 
v2. However, the behavior you mentioned is quite consistent with the v1 
TimelineEvent. Could you please double check this? Thanks! 



[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-21 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636050#comment-14636050
 ] 

Sangjin Lee commented on YARN-3908:
---

Hmm, then isn't this incorrect?

{code:title=TimelineEvent.java|borderStyle=solid}
  @Override
  public int compareTo(TimelineEvent other) {
    if (timestamp > other.timestamp) {
      return -1;
    } else if (timestamp < other.timestamp) {
      return 1;
    } else {
      return eventType.compareTo(other.eventType);
    }
  }

  @Override
  public boolean equals(Object o) {
    if (this == o)
      return true;
    if (o == null || getClass() != o.getClass())
      return false;

    TimelineEvent event = (TimelineEvent) o;

    if (timestamp != event.timestamp)
      return false;
    if (!eventType.equals(event.eventType))
      return false;
    if (eventInfo != null ? !eventInfo.equals(event.eventInfo) :
        event.eventInfo != null)
      return false;

    return true;
  }

  @Override
  public int hashCode() {
    int result = (int) (timestamp ^ (timestamp >>> 32));
    result = 31 * result + eventType.hashCode();
    result = 31 * result + (eventInfo != null ? eventInfo.hashCode() : 0);
    return result;
  }
{code}

First of all, id is not even used. Instead type is used. Also, event info is 
part of the equality semantics.



[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-21 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14636058#comment-14636058
 ] 

Li Lu commented on YARN-3908:
-

Hi [~sjlee0], I think the code you posted here belongs to timeline v1 
(o.a.h.yarn.api.records.timeline.*), but the v2 version is in 
o.a.h.yarn.api.records.timelineservice.*. TimelineEvent in v2, modified in 
YARN-3836, does use id for all related tasks. We're no longer using event info 
for equality check in that version. 



[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-21 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14636057#comment-14636057
 ] 

Sangjin Lee commented on YARN-3908:
---

Sorry, my bad. I mistakenly pulled up v1 of {{TimelineEvent}}. Our version 
uses only the id and the timestamp for equality:

{code:title=TimelineEvent.java|borderStyle=solid}
  @Override
  public int hashCode() {
    int result = (int) (timestamp ^ (timestamp >>> 32));
    result = 31 * result + id.hashCode();
    return result;
  }

  @Override
  public boolean equals(Object o) {
    if (this == o)
      return true;
    if (!(o instanceof TimelineEvent))
      return false;

    TimelineEvent event = (TimelineEvent) o;

    if (timestamp != event.timestamp)
      return false;
    if (!id.equals(event.id)) {
      return false;
    }
    return true;
  }

  @Override
  public int compareTo(TimelineEvent other) {
    if (timestamp > other.timestamp) {
      return -1;
    } else if (timestamp < other.timestamp) {
      return 1;
    } else {
      return id.compareTo(other.id);
    }
  }
{code}

So that answers my first question. Sorry for the confusion! Only the second 
question remains...
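
To make the consequence of these semantics concrete, here is a minimal, self-contained sketch. {{SketchEvent}} and {{SketchEventDemo}} are hypothetical stand-ins, not the real {{TimelineEvent}}, and the sketch assumes the comparison orders newer timestamps first. It shows that two events with the same id and timestamp are treated as one element even when their info maps differ.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical stand-in for the v2 event: only id and timestamp take part
// in equals/hashCode/compareTo; the info map is deliberately ignored.
final class SketchEvent implements Comparable<SketchEvent> {
  final String id;
  final long timestamp;
  final Map<String, Object> info = new HashMap<>();

  SketchEvent(String id, long timestamp) {
    this.id = id;
    this.timestamp = timestamp;
  }

  @Override
  public int hashCode() {
    int result = (int) (timestamp ^ (timestamp >>> 32));
    return 31 * result + id.hashCode();
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (!(o instanceof SketchEvent)) {
      return false;
    }
    SketchEvent other = (SketchEvent) o;
    return timestamp == other.timestamp && id.equals(other.id);
  }

  @Override
  public int compareTo(SketchEvent other) {
    // Assumption: newer events sort first; ties are broken by id.
    if (timestamp > other.timestamp) {
      return -1;
    } else if (timestamp < other.timestamp) {
      return 1;
    }
    return id.compareTo(other.id);
  }
}

final class SketchEventDemo {
  // Builds a sorted set from three events, two of which share id and timestamp.
  static TreeSet<SketchEvent> sample() {
    SketchEvent a = new SketchEvent("start", 100L);
    a.info.put("k", "v1");
    SketchEvent b = new SketchEvent("start", 100L);
    b.info.put("k", "v2"); // differs from a only in info
    SketchEvent c = new SketchEvent("finish", 200L);

    TreeSet<SketchEvent> set = new TreeSet<>();
    set.add(a);
    set.add(b); // rejected: compares equal to a
    set.add(c);
    return set;
  }

  public static void main(String[] args) {
    for (SketchEvent e : sample()) {
      System.out.println(e.timestamp + " " + e.id);
    }
  }
}
```

Because the second "start" event compares equal to the first, the set ends up with two elements, and whichever copy was added first wins.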



[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-20 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14633937#comment-14633937
 ] 

Sangjin Lee commented on YARN-3908:
---

bq. 1. The events have been written into metrics column family.

Correct, that was another bug, and it's been corrected with the patch. Have you 
tried the patch to see if the problem has been fixed?

bq. 2. The entity is not accessible immediately after a single put operation.

Could you kindly elaborate a bit more on the conditions under which this happens? IIUC, 
the current mechanism is using the buffered mutator 
(http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/BufferedMutator.html),
 and the writes are flushed to the HBase table in a batch asynchronous manner. 
Perhaps you're encountering that behavior?
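
As an illustration of why a buffered write may not be readable right away, here is a minimal in-memory sketch. {{BufferedWriterSketch}} is a hypothetical stand-in, not the HBase client API, and it flushes on a mutation count, whereas the real buffered mutator's flush policy is governed by its own buffer settings.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a table fronted by a client-side write buffer.
final class BufferedWriterSketch {
  private final Map<String, String> table = new HashMap<>();
  private final List<String[]> buffer = new ArrayList<>();
  private final int flushThreshold;

  BufferedWriterSketch(int flushThreshold) {
    this.flushThreshold = flushThreshold;
  }

  // Queues the mutation; it reaches the table only once the buffer fills
  // up or flush() is called explicitly.
  void write(String rowKey, String value) {
    buffer.add(new String[] {rowKey, value});
    if (buffer.size() >= flushThreshold) {
      flush();
    }
  }

  // Applies all queued mutations to the table and empties the buffer.
  void flush() {
    for (String[] mutation : buffer) {
      table.put(mutation[0], mutation[1]);
    }
    buffer.clear();
  }

  // Reads see only what has already been flushed.
  String read(String rowKey) {
    return table.get(rowKey);
  }
}

final class BufferedWriterDemo {
  public static void main(String[] args) {
    BufferedWriterSketch writer = new BufferedWriterSketch(100);
    writer.write("entity-1", "created");
    System.out.println("before flush: " + writer.read("entity-1"));
    writer.flush();
    System.out.println("after flush: " + writer.read("entity-1"));
  }
}
```

In this model a single put stays invisible until something triggers a flush, which matches the symptom described in item 2.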




[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-20 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14633936#comment-14633936
 ] 

Sangjin Lee commented on YARN-3908:
---

The JIRA site seems offline right now.

bq. 1. The events have been written into metrics column family.

Correct, that was another bug, and it's been corrected with the patch. Have
you tried the patch to see if the problem has been fixed?

bq. 2. The entity is not accessible immediately after a single put
operation.

Could you kindly elaborate a bit more on the conditions under which this happens?
IIUC, the current mechanism is using the buffered mutator (
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/BufferedMutator.html),
and the writes are flushed to the HBase table in a batch asynchronous
manner. Perhaps you're encountering that behavior?


On Mon, Jul 20, 2015 at 10:21 AM, Zhijie Shen (JIRA) j...@apache.org





[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-20 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14633857#comment-14633857
 ] 

Zhijie Shen commented on YARN-3908:
---

I found two more issues upon debugging the reader POC:

1. The events have been written into metrics column family.

2. The entity is not accessible immediately after a single put operation. 



[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-20 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14634383#comment-14634383
 ] 

Sangjin Lee commented on YARN-3908:
---

I'd like to propose closing this issue with the current patch (provided 
everyone is happy with the patch of course) with the understanding that the 
patch fixes all the issues mentioned here with the exception of the flush 
behavior. I do think handling the flush might be more far reaching than the 
changes that are being done here, so I'm inclined to create a separate ticket 
for it. Let me know what you think.



[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-20 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14634200#comment-14634200
 ] 

Zhijie Shen commented on YARN-3908:
---

I set up hbase-1.0.1.1 as a single-node cluster on the local FS and submitted an 
MR job. After the job finished, I used the REST API (YARN-3049) to read the 
entity and got NOT FOUND. Scanning the entity table from the hbase shell 
returned NOT FOUND as well.

We may want to rethink the buffer policy. It is not a good user experience if 
the entity is still unavailable to users after the app has finished.



[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-20 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14634293#comment-14634293
 ] 

Sangjin Lee commented on YARN-3908:
---

Currently we're not calling {{BufferedMutator.flush()}} explicitly. It sounds 
like that might be a good thing to do. [~jrottinghuis], [~vrushalic], thoughts 
on calling {{BufferedMutator.flush()}} at the end of the {{write()}} call? Did 
you consider doing that when you created the hbase writer?



[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14632094#comment-14632094
 ] 

Hadoop QA commented on YARN-3908:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  17m  5s | Findbugs (version ) appears to be broken on YARN-2928. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac |   8m  3s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc |   9m 55s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 18s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install |   1m 28s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 40s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 21s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   0m 25s | Tests passed in hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests |   1m 22s | Tests passed in hadoop-yarn-server-timelineservice. |
| | |  43m  4s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12745903/YARN-3908-YARN-2928.005.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / eb1932d |
| hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8578/artifact/patchprocess/testrun_hadoop-yarn-api.txt |
| hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/8578/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8578/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8578/console |


This message was automatically generated.



[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-17 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631826#comment-14631826
 ] 

Sangjin Lee commented on YARN-3908:
---

Thanks for the comment [~gtCarrera9]. I agree the name is a bit awkward. Let me 
see if I can rename it to something more appropriate. Will update.



[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-16 Thread Joep Rottinghuis (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14630275#comment-14630275
 ] 

Joep Rottinghuis commented on YARN-3908:


bq. In fact, I'm wondering if we should put info and events into a separate 
column family like what we did for configs/metrics?

In principle we should keep everything in the same column family (fewer store 
files) unless:
a) The items that we store require a different TTL, compression, etc. This is 
the case for metrics, where we need a separate TTL.
b) The columns are rather significant in size, and many queries will skip them 
(and specifically will not use them in push-down predicates, i.e. column value 
filters etc.). This is the case for configuration. If we have many queries that 
retrieve just the info fields and skip configs, then iterating over just the 
rows in the info column family has the benefit of not needing to access the 
config store files.

Otherwise a separate column family just results in more store files and doesn't 
really gain us anything.
Given the current code setup, switching column family is almost trivial, so 
given that there are no functionality differences, I'd say let's not even try 
to further optimize this until we have way more code in place.
Then we can run large batches of historical job history files and other inputs 
(perhaps porting data from ATS v1) and see the potential benefit or downside.

The other reason not to do premature optimization is that I'm still thinking of 
adding a few more perf tweaks. Those would also be purely performance 
optimizations with no functional difference, so they are not a priority now 
either. We should look at tuning all those things much later and together in a 
coherent way. Additional settings that we need to test are RPC compression, 
encoding of the store files, and/or compression of the same.

In short, let's focus on completing functionality and then tinker with these 
settings later. 



[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-16 Thread Joep Rottinghuis (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14630289#comment-14630289
 ] 

Joep Rottinghuis commented on YARN-3908:


Patch looks good with one comment. I completely overlooked the event info map, 
because it isn't part of the javadoc on the EntityTable. I should have 
double-checked but didn't. Thanks for catching this.

[~sjlee0] I think it would be good to update the javadoc that describes the 
EntityTable in the EntityTable.java file.
The same is probably missing from the doc "Timeline service schema for native 
HBase tables" (not sure which jira the PDF for that doc is attached to), 
because I copied it from the code. I don't think that the application table has 
been copied yet, so it won't be missing from there. 



[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-16 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14630769#comment-14630769
 ] 

Li Lu commented on YARN-3908:
-

Hi [~sjlee0], one nit: the name {{readTimeseriesResults}} looks a little bit 
confusing if used with timeline events ({{ 
EntityColumnPrefix.EVENT.readTimeseriesResults(result)}}). It looks to be 
related to the time series data within a timeline metric. Maybe we can change 
that into a more general name to accommodate the generic type V (or T)? 



[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14630479#comment-14630479
 ] 

Hadoop QA commented on YARN-3908:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  18m  5s | Findbugs (version ) appears to be broken on YARN-2928. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac |   8m 23s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc |  10m 23s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 22s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install |   1m 28s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 41s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 24s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   0m 26s | Tests passed in hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests |   1m 21s | Tests passed in hadoop-yarn-server-timelineservice. |
| | |  45m  2s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12745704/YARN-3908-YARN-2928.004.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / eb1932d |
| hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8563/artifact/patchprocess/testrun_hadoop-yarn-api.txt |
| hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/8563/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8563/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8563/console |


This message was automatically generated.



[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-13 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14624973#comment-14624973
 ] 

Sangjin Lee commented on YARN-3908:
---

I would appreciate your review on this. Thanks!



[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-13 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625369#comment-14625369
 ] 

Zhijie Shen commented on YARN-3908:
---

[~vrushalic] and [~sjlee0], thanks for helping fix the problems. I've two 
questions:

1. In fact, I'm wondering if we should put info and events into a separate 
column family like what we did for configs/metrics?

2. We don't want to store the metric type, do we?



[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-13 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625387#comment-14625387
 ] 

Sangjin Lee commented on YARN-3908:
---

bq. 2. We don't want to store the metric type, do we?

Maybe I was mistaken when I read your comment that said "I also realized that 
the metric type is not persisted too." I took it to mean that you're suggesting 
that we persist it. I also had an offline chat with [~vrushalic], and she 
clarified that we probably do not need to persist the metric type and that we 
can distinguish between a single value metric vs. time series as you described.

We still need to make some changes to ensure that the hbase writer sets the 
right min version when it writes a single-value metric (currently it's not 
being done).



[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-13 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625397#comment-14625397
 ] 

Zhijie Shen commented on YARN-3908:
---

Yeah, but the method based on the number of metric values is not guaranteed to 
work. Are we okay with that?



[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-13 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625414#comment-14625414
 ] 

Sangjin Lee commented on YARN-3908:
---

When time series data expires after the TTL (except for the latest value), the 
column will contain only a single value. For all practical purposes, the metric 
at that point would act like a single value. We thought that would be fine.

Do you see a situation where (probably on the read path) we need to recognize 
some metric as a time series and do something different *even though* there is 
only one value in the column?
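
To make the scenario concrete, here is a minimal sketch in plain Java (not the HBase API) of a column whose cells expire after a TTL while the newest cell is always retained, which is roughly what a column family that keeps a minimum of one version would do. The class name `TtlColumnSketch` and its methods are hypothetical and exist only for this illustration.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical stand-in for one metric column; not part of Hadoop or HBase. */
class TtlColumnSketch {
  // Cell timestamp -> value, kept sorted so the newest cell is lastKey().
  private final NavigableMap<Long, Long> cells = new TreeMap<>();
  private final long ttlMillis;

  TtlColumnSketch(long ttlMillis) {
    this.ttlMillis = ttlMillis;
  }

  void put(long timestamp, long value) {
    cells.put(timestamp, value);
  }

  /** Returns the cells visible at time 'now': unexpired ones plus the newest. */
  NavigableMap<Long, Long> visibleAt(long now) {
    NavigableMap<Long, Long> visible = new TreeMap<>();
    if (cells.isEmpty()) {
      return visible;
    }
    // Cells with timestamp >= now - ttl have not expired yet.
    visible.putAll(cells.tailMap(now - ttlMillis, true));
    // Always keep the newest cell, even if it is older than the TTL.
    visible.put(cells.lastKey(), cells.get(cells.lastKey()));
    return visible;
  }

  public static void main(String[] args) {
    TtlColumnSketch column = new TtlColumnSketch(1000L);
    column.put(100L, 5L);
    column.put(200L, 7L);
    column.put(300L, 9L);
    System.out.println(column.visibleAt(500L));   // all three cells are visible
    System.out.println(column.visibleAt(5000L));  // only the newest cell remains
  }
}
```

Once only the newest cell remains, nothing in the column itself says whether it was written as a time series or as a single value, which is the case the question above is about.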



[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-13 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625502#comment-14625502
 ] 

Sangjin Lee commented on YARN-3908:
---

bq. 1. In fact, I'm wondering if we should put info and events into a separate 
column family like what we did for configs/metrics?

[~vrushalic], [~jrottinghuis], your thoughts on this?



[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14621982#comment-14621982
 ] 

Hadoop QA commented on YARN-3908:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  15m 38s | Findbugs (version ) appears to be broken on YARN-2928. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 48s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc |   9m 42s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 18s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install |   1m 43s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 40s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   0m 47s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   1m 22s | Tests passed in hadoop-yarn-server-timelineservice. |
| | |  38m 27s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12744663/YARN-3908-YARN-2928.001.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / 2d4a8f4 |
| hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/8494/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8494/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8494/console |


This message was automatically generated.


[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14623180#comment-14623180
 ] 

Hadoop QA commented on YARN-3908:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  17m  3s | Pre-patch YARN-2928 compilation is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 48s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc |   9m 44s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 23s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install |   1m 37s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 40s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 23s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   0m 25s | Tests passed in hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests |   1m 21s | Tests passed in hadoop-yarn-server-timelineservice. |
| | |  42m 51s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12744857/YARN-3908-YARN-2928.003.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / 2d4a8f4 |
| hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8505/artifact/patchprocess/testrun_hadoop-yarn-api.txt |
| hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/8505/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8505/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8505/console |


This message was automatically generated.



[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14623133#comment-14623133
 ] 

Hadoop QA commented on YARN-3908:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  16m 52s | Findbugs (version ) appears to be broken on YARN-2928. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to include 1 new or modified test files. |
| {color:red}-1{color} | javac |   7m 46s | The applied patch generated  1  additional warning messages. |
| {color:green}+1{color} | javadoc |   9m 48s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 58s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install |   1m 39s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 41s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 20s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   0m 26s | Tests passed in hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests |   1m 22s | Tests passed in hadoop-yarn-server-timelineservice. |
| | |  42m 19s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12744831/YARN-3908-YARN-2928.002.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / 2d4a8f4 |
| javac | https://builds.apache.org/job/PreCommit-YARN-Build/8502/artifact/patchprocess/diffJavacWarnings.txt |
| hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8502/artifact/patchprocess/testrun_hadoop-yarn-api.txt |
| hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/8502/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8502/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8502/console |


This message was automatically generated.



[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-10 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14622727#comment-14622727
 ] 

Sangjin Lee commented on YARN-3908:
---

Thanks [~vrushalic]! I'll look at it and see if more changes are needed. Other 
reviews are welcome too.



[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-10 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14622987#comment-14622987
 ] 

Sangjin Lee commented on YARN-3908:
---

A couple of comments on the latest patch:

I find that essentially the event timestamp is stored in the same manner as a 
metric timestamp. So the mechanism of retrieving the event timestamp is nearly 
identical to {{ColumnPrefix.readTimeseriesResults()}}. The only restriction of 
the previous version of that method is that it explicitly cast the values to 
{{Number}}. In the patch I genericized the method to handle any value type.
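
As a rough illustration of that genericization, the sketch below parameterizes a reader on the value type instead of casting every value to `Number`. It uses plain Java collections rather than HBase result objects. The class `TimeseriesReaderSketch`, the method `read`, and the input shape (column name to timestamped raw values) are assumptions made for this example, not the actual `ColumnPrefix` code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical sketch; not the ColumnPrefix code, which reads HBase results. */
class TimeseriesReaderSketch {

  /**
   * Converts column -> (timestamp -> raw value) into sorted maps of typed values.
   * Passing Number.class gives a Number-only read; any other class (for example
   * String.class for event payloads) goes through the same code path.
   */
  static <V> NavigableMap<String, NavigableMap<Long, V>> read(
      Map<String, Map<Long, Object>> raw, Class<V> valueType) {
    NavigableMap<String, NavigableMap<Long, V>> out = new TreeMap<>();
    for (Map.Entry<String, Map<Long, Object>> column : raw.entrySet()) {
      NavigableMap<Long, V> series = new TreeMap<>();
      for (Map.Entry<Long, Object> cell : column.getValue().entrySet()) {
        // Class.cast throws ClassCastException if the stored value has another type.
        series.put(cell.getKey(), valueType.cast(cell.getValue()));
      }
      out.put(column.getKey(), series);
    }
    return out;
  }

  public static void main(String[] args) {
    Map<String, Map<Long, Object>> metrics = new HashMap<>();
    Map<Long, Object> cpu = new HashMap<>();
    cpu.put(2L, 20);
    cpu.put(1L, 10);
    metrics.put("cpu", cpu);
    System.out.println(read(metrics, Number.class));

    Map<String, Map<Long, Object>> events = new HashMap<>();
    Map<Long, Object> started = new HashMap<>();
    started.put(5L, "container launched");
    events.put("started", started);
    System.out.println(read(events, String.class));
  }
}
```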

Having said that, I recognize that reading an event using a method named 
{{readTimeseriesResults()}} is rather awkward. But alternatives are not great 
either. {{ColumnPrefix}} is an enum, and it is quite awkward to introduce a 
method that is useful for only one value of that enum. Perhaps we could create 
a static wrapper method to bridge that gap. Let me know what you think.
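
One possible shape for such a wrapper, sketched with hypothetical names: the enum keeps its generically named read method, and a small static helper gives the event read path a name that says what it does. `PrefixSketch`, `EventReads`, the `!` separator, and the in-memory `store` argument are illustrative assumptions, not the classes in the patch.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical enum standing in for a column-prefix enum. */
enum PrefixSketch {
  EVENT("e"), METRIC("m");

  private final String prefix;

  PrefixSketch(String prefix) {
    this.prefix = prefix;
  }

  /** Generic read: timestamp -> value for every column stored under this prefix. */
  <V> NavigableMap<String, NavigableMap<Long, V>> readTimeseriesResults(
      Map<String, NavigableMap<Long, Object>> store, Class<V> type) {
    NavigableMap<String, NavigableMap<Long, V>> out = new TreeMap<>();
    String start = prefix + "!"; // assumed separator between prefix and column name
    for (Map.Entry<String, NavigableMap<Long, Object>> col : store.entrySet()) {
      if (!col.getKey().startsWith(start)) {
        continue;
      }
      NavigableMap<Long, V> series = new TreeMap<>();
      for (Map.Entry<Long, Object> cell : col.getValue().entrySet()) {
        series.put(cell.getKey(), type.cast(cell.getValue()));
      }
      out.put(col.getKey().substring(start.length()), series);
    }
    return out;
  }
}

/** Static wrapper so event readers do not call a method named after time series. */
final class EventReads {
  private EventReads() {
  }

  static NavigableMap<String, NavigableMap<Long, Object>> readEvents(
      Map<String, NavigableMap<Long, Object>> store) {
    return PrefixSketch.EVENT.readTimeseriesResults(store, Object.class);
  }
}
```

The wrapper adds no behavior of its own. It only hides the time-series naming from callers that deal with events, so the enum does not need a method that is meaningful for just one of its values.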

Finally, I find that we're still not persisting the metric types. I could be 
wrong, but it appears that all metrics are treated as time series when they are 
stored. I'll see if it would be straightforward to implement that piece, but it 
could be a bit involved. How about capturing that in its own subtask?



[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-09 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14621592#comment-14621592
 ] 

Zhijie Shen commented on YARN-3908:
---

1. TimelineEvent has a timestamp associated with it. It tells us when the event 
happened. We should have this information persisted, but unfortunately it seems 
that it is not.

2. Metric doesn't have a timestamp because the timestamp is associated with 
each individual value.

3. I also realized that the metric type is not persisted either. For now, the 
reader implementation just assumes that size(metric) > 1 => time series, 
else => single value. But that may not be guaranteed.
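
The reader-side heuristic in point 3 can be sketched as follows. This is an illustration with hypothetical names (`MetricKind`, `MetricKindSketch.infer`), not the reader implementation. It also shows why the rule is not guaranteed: a time series that currently holds one value is indistinguishable from a single-value metric.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical classification used only in this sketch. */
enum MetricKind { SINGLE_VALUE, TIME_SERIES }

class MetricKindSketch {
  /** Infers the kind purely from the number of stored values. */
  static MetricKind infer(Map<Long, Number> values) {
    return values.size() > 1 ? MetricKind.TIME_SERIES : MetricKind.SINGLE_VALUE;
  }

  public static void main(String[] args) {
    Map<Long, Number> series = new TreeMap<>();
    series.put(1L, 10);
    series.put(2L, 20);
    System.out.println(infer(series)); // TIME_SERIES

    // A time series that has reported only once looks like a single value.
    Map<Long, Number> young = new TreeMap<>();
    young.put(1L, 10);
    System.out.println(infer(young)); // SINGLE_VALUE
  }
}
```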




[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-09 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14621571#comment-14621571
 ] 

Vrushali C commented on YARN-3908:
--

Hi [~zjshen]

I see that event#info is not being stored, but which event timestamp is being 
referred to? Event metrics do store the timestamp per metric.

(Also, I will be on vacation starting tomorrow through next week, so I am 
checking with Sangjin offline about this.)

Thanks,
Vrushali



[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl

2015-07-09 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14621480#comment-14621480
 ] 

Zhijie Shen commented on YARN-3908:
---

Assigning it to [~vrushalic] by default. Please feel free to rebalance the 
workload.
