Agreed. I could also contribute to that doc.

On Sat, Nov 5, 2016, 11:41 Kyle Richardson <[email protected]> wrote:

> Thanks, James. Very helpful information. Based on that, I agree the path is
> there and I have no issues with it being manual at this point. I would
> suggest we add a simple UPGRADING.md outlining the steps you have with a
> little more detail to make it easy for the user. I'd be happy to take this
> on if folks agree it would be useful.
>
> -Kyle
>
> On Sat, Nov 5, 2016 at 7:56 AM, Casey Stella <[email protected]> wrote:
>
> > I agree. I think the upgrade path is clear, however manual, right now.
> > Going forward we will need to prioritize making it more automated, but I
> > think the path is there.
> >
> > On Sat, Nov 5, 2016 at 00:26 James Sirota <[email protected]> wrote:
> >
> > > Hi Kyle,
> > >
> > > The HDP upgrade guide can be found here:
> > >
> > > https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.0/bk_command-line-upgrade/content/ch_upgrade_2_4.html
> > >
> > > After executing these instructions you get to HDP 2.5 with no data loss.
> > > After that, upgrading Metron is as simple as saving the old configs, ES
> > > templates, grok statements from HDFS, and NiFi flows from your 0.2.1
> > > build, installing 0.2.2 (via Ambari management pack), and putting the
> > > configs back into zookeeper, copying the ES templates and Grok files
> > > back, and restarting your NiFi flows. I agree that we should automate
> > > most of this eventually, and we will, but I don't think this is
> > > necessarily a show stopper for dropping BETA. Would you agree?
> > >
> > > Thanks,
> > > James
> > >
> > > 04.11.2016, 18:27, "Kyle Richardson" <[email protected]>:
> > > > I'm a little late to the party but thought I would go ahead and throw
> > > > my two cents into the mix.
> > > >
> > > > I share the concern around an upgrade / migration path. While I would
> > > > love to see the BETA dropped sooner rather than later, to me, this is
> > > > a game changer for people implementing Metron.
> > > > I think there is a silent expectation of no data loss after dropping
> > > > the BETA tag.
> > > >
> > > > Even if there is not a direct upgrade path for a few releases, is there
> > > > documentation that we could provide to ensure a data migration path for
> > > > users? I'm not thinking anything automated just some instructions on
> > > > what to do.
> > > >
> > > > -Kyle
> > > >
> > > > On Fri, Nov 4, 2016 at 9:16 AM, Casey Stella <[email protected]> wrote:
> > > >
> > > >> Jon,
> > > >>
> > > >> Thank you for your thoughts; they are appreciated and you should keep
> > > >> them coming. This kind of discussion is exactly why I sent out this
> > > >> thread. I think it's safe to say that the entire community shares your
> > > >> desire for Metron to be as easy to use as possible and a "data analysis
> > > >> platform for the masses." We should hold ourselves to a high standard,
> > > >> no doubt.
> > > >>
> > > >> Casey
> > > >>
> > > >> On Fri, Nov 4, 2016 at 6:30 AM, [email protected] <[email protected]> wrote:
> > > >>
> > > >> > Please understand that my points mostly relate to perception and
> > > >> > ease of use, not what's technically possible or available. I'm
> > > >> > coming at this as Metron should be a data analysis platform for the
> > > >> > masses.
> > > >> >
> > > >> > METRON-517/542 - While I'm willing to let this one go it depends on
> > > >> > your definition of non-issue. I personally believe that data (in
> > > >> > every location that it exists) needs to be obvious and have ultra
> > > >> > high integrity. I'm not concerned that the correct data won't exist
> > > >> > somewhere in the cluster, I'm focusing on it being easily accessible
> > > >> > by an operations team that may consist of entry level analysts. Once
> > > >> > 517 is done and merged I would consider that a short term mitigation
> > > >> > is in place.
> > > >> >
> > > >> > I feel like the project should stick to certain principles and a
> > > >> > suggestion is that data access is easy, accurate, and obvious. Do we
> > > >> > have anything like this that was agreed upon, discussed, or
> > > >> > documented? Probably a discussion for a different thread.
> > > >> >
> > > >> > METRON-485/470/etc. were mostly to illustrate a consistency issue,
> > > >> > and resolving them would give a better first impression (assuming
> > > >> > that people monitoring the project will start using it more once
> > > >> > it's non-BETA software). First impressions are big in my book and
> > > >> > could affect initial adoption.
> > > >> >
> > > >> > Regarding 485 - Otto may be able to clarify but I thought somebody
> > > >> > else saw this issue as well. I think the finger is currently being
> > > >> > pointed at monit timeouts and not storm. It also doesn't happen
> > > >> > every single time, I only run into it while the cluster is under
> > > >> > load and after dozens of topology restarts that I do when tuning
> > > >> > parallelism in storm. I'm going to be updating to storm 1.0.x in
> > > >> > order to see if this still exists. Again, this relates to ease of
> > > >> > use/load testing/tuning.
> > > >> >
> > > >> > Agree with the upgrade comments - as long as it's supported at some
> > > >> > defined point (IMHO this is when a project leaves BETA but others
> > > >> > are welcome to disagree).
> > > >> >
> > > >> > Finally, I know this doesn't come across well in email but I'm just
> > > >> > mentioning items which I think are important, not attempting to
> > > >> > demand that they be fixed or that this doesn't leave beta.
> > > >> > Thanks,
> > > >> >
> > > >> > Jon
> > > >> >
> > > >> > On Thu, Nov 3, 2016, 16:44 James Sirota <[email protected]> wrote:
> > > >> >
> > > >> >
> > > >> > Hi Jon,
> > > >> >
> > > >> > Here are my thoughts around your objections.
> > > >> >
> > > >> > METRON-517/METRON-542
> > > >> >
> > > >> > I think the mechanism currently exists within Metron to make this a
> > > >> > non-issue. I believe you can solve it with a combination of a
> > > >> > Stellar statement and ES templates. As you mentioned, we can
> > > >> > truncate the string and then include the relevant meta data in the
> > > >> > message (original length, hash, etc). Cramming really long strings
> > > >> > into ES is generally a bad thing, which is why this limitation
> > > >> > exists. The metadata in the indexed message along with the timestamp
> > > >> > allows you to pull data from HDFS should you need to recover the
> > > >> > full string.
> > > >> >
> > > >> > METRON-485
> > > >> >
> > > >> > We cannot replicate this issue in our environment, but if this is
> > > >> > indeed an issue this is an issue with Storm. A Jira should be filed
> > > >> > against Storm and not against Metron. My hunch, though, is that it's
> > > >> > probably something in your environment. I just tried stopping all
> > > >> > topologies on my AWS cluster and then went to all Storm nodes and
> > > >> > didn't see any workers left behind.
> > > >> >
> > > >> > METRON-470
> > > >> >
> > > >> > I think this is mainly a consistency issue. I don't think this
> > > >> > impacts the stability or function of the software. I think this is a
> > > >> > nice to have, maybe in the next few releases, but I don't think we
> > > >> > absolutely have to have this to drop BETA.
> > > >> >
> > > >> > With respect to upgrades, here are my thoughts.
> > > >> > There is really no way to upgrade Metron 0.2.1 to Metron 0.2.2 in
> > > >> > place because it requires a change of HDP. The new build will only
> > > >> > be compatible with HDP 2.5 and not 2.4. So you have to lay down a
> > > >> > new cluster regardless. We can document how to get the configs off
> > > >> > of your old Metron and plug them into your new Metron so that it
> > > >> > works the same. That shouldn't be a problem.
> > > >> >
> > > >> > Our upgrade path for future releases will revolve around the Ambari
> > > >> > Metron management pack that is available with the upcoming build.
> > > >> > Right now the install capability is available and the upgrade
> > > >> > capability will come in incrementally within the next few releases.
> > > >> > We will additionally deprecate Monit and switch that functionality
> > > >> > to Ambari as well. Finally, we will also use Ambari for metrics
> > > >> > monitoring. There is lots to do so we will triage and prioritize
> > > >> > Jiras as a community to see which parts we want to tackle first.
> > > >> > This is why your participation in the community is so valuable.
> > > >> >
> > > >> > Thanks,
> > > >> > James
> > > >> >
> > > >> >
> > > >> >
> > > >> > 03.11.2016, 11:07, "[email protected]" <[email protected]>:
> > > >> > > I agree that we can split METRON-517 into a short term and long
> > > >> > > term fix. I have attempted to organize my thoughts regarding the
> > > >> > > long term fix into METRON-542 and can get a PR out for METRON-517
> > > >> > > soon to close that out.
> > > >> > >
> > > >> > > This leaves cluster tuning and a valid upgrade path for users, the
> > > >> > > latter of which is my predominant concern.
> > > >> > > If the team is willing to say that starting with 0.2.2 there will
> > > >> > > be a valid upgrade path to future releases I think that removing
> > > >> > > the BETA tag at 0.2.2 is reasonable. That said, this is just
> > > >> > > following my perception of what the BETA tag represents.
> > > >> > >
> > > >> > > Jon
> > > >> > >
> > > >> > > On Thu, Nov 3, 2016 at 11:50 AM Casey Stella <[email protected]> wrote:
> > > >> > >
> > > >> > >> Ok, regarding METRON-517, I've thought about this a bit having
> > > >> > >> read your really great and detailed JIRA as well as the
> > > >> > >> discussion around this on the dev list between you and Matt
> > > >> > >> Foley. I want to separate the discussion between what is the
> > > >> > >> correct long-term solution for this issue versus what is an
> > > >> > >> acceptable solution.
> > > >> > >>
> > > >> > >> In terms of an acceptable work-around, my opinion is that because
> > > >> > >> we allow the user to modify the ES template they can
> > > >> > >>
> > > >> > >> - Adjust the template to specify ignore_above
> > > >> > >>   <https://www.elastic.co/guide/en/elasticsearch/reference/current/ignore-above.html>
> > > >> > >>   on fields which they feel are likely to be large (maybe every
> > > >> > >>   string field)
> > > >> > >> - The combination of timestamp and ip_src_addr should be
> > > >> > >>   sufficient for picking out the raw data in question from the
> > > >> > >>   HDFS store
> > > >> > >> - A stellar enrichment can be used to tag the messages with large
> > > >> > >>   URIs and that can factor into the threat triage or even be used
> > > >> > >>   to filter in kibana
> > > >> > >> - As you say, you can use the profiler to track counts of such
> > > >> > >>   messages if you so
> > > >> > >>   desire and factor that into threat alerting or filtering in
> > > >> > >>   kibana.
> > > >> > >>
> > > >> > >> Ultimately, I believe we have exposed the appropriate set of
> > > >> > >> tooling to provide an acceptable solution for the moment. Now, as
> > > >> > >> for the best long-term solution, I will let the good discussion
> > > >> > >> on the mailing list and JIRA continue and contribute my thoughts
> > > >> > >> on the JIRA <https://issues.apache.org/jira/browse/METRON-517>.
> > > >> > >>
> > > >> > >> Of course, this is just $0.02 :)
> > > >> > >>
> > > >> > >> Apologies to Dave, I wanted to mark this aspect of the discussion
> > > >> > >> on this thread as it is relevant to sufficient criteria to remove
> > > >> > >> the BETA tag.
> > > >> > >>
> > > >> > >> Best,
> > > >> > >>
> > > >> > >> Casey
> > > >> > >>
> > > >> > >> On Thu, Nov 3, 2016 at 7:26 AM, [email protected] <[email protected]> wrote:
> > > >> > >>
> > > >> > >> > To clarify, it only needs to truncate fields > 32766 which need
> > > >> > >> > a full/exact string match search to be run on them (analyzed
> > > >> > >> > fields generally would not hit this limitation but I guess in
> > > >> > >> > theory they could). However, that's probably every field which
> > > >> > >> > can get > 32766 because I'm assuming those will all be strings.
> > > >> > >> >
> > > >> > >> > I also think using the profiler to monitor the truncation
> > > >> > >> > action could be a useful default.
> > > >> > >> >
> > > >> > >> > Jon
> > > >> > >> >
> > > >> > >> > On Wed, Nov 2, 2016, 21:08 [email protected] <[email protected]> wrote:
> > > >> > >> >
> > > >> > >> > > That would break searching on uri entirely unless you queried
> > > >> > >> > > and knew to truncate at 32766 because it's not analyzed. I
> > > >> > >> > > don't like pushing that complication to the end user.
> > > >> > >> > >
> > > >> > >> > > I would suggest truncation in the indexingBolt (not using
> > > >> > >> > > stellar because you'd want this across the board) for all
> > > >> > >> > > fields > 32766 (how do we make sure this gets updated if the
> > > >> > >> > > limitation changes in Lucene?) and adding metadata key-value
> > > >> > >> > > pairs (pre-trunc length, hash, truncated bool, etc.). In the
> > > >> > >> > > URI scenario I would also suggest doing a multifield mapping
> > > >> > >> > > by default because of the way that data is useful (not sure
> > > >> > >> > > which analyser to use though - maybe write or find a good URI
> > > >> > >> > > analyzer?). Since timestamp is a required field for all
> > > >> > >> > > messages (I'm pretty sure?) I'm ok with timestamp and field
> > > >> > >> > > value used as the UID, but would prefer something better.
> > > >> > >> > >
> > > >> > >> > > Jon
> > > >> > >> > >
> > > >> > >> > > On Wed, Nov 2, 2016, 20:33 James Sirota <[email protected]> wrote:
> > > >> > >> > >
> > > >> > >> > > Jon,
> > > >> > >> > >
> > > >> > >> > > For METRON-517 would it suffice to have a stellar statement to
> > > >> > >> > > take a URI string and truncate it to length of 32766 in the ES
> > > >> > >> > > writer?
> > > >> > >> > > But still write the actual string to HDFS? You can then search
> > > >> > >> > > against ES on the truncated portion, but retrieve the actual
> > > >> > >> > > timestamp from HDFS. It's easy to do because you know the
> > > >> > >> > > timestamp from the original message. So you know which logs in
> > > >> > >> > > HDFS to search through to find the data.
> > > >> > >> > >
> > > >> > >> > > 02.11.2016, 14:12, "[email protected]" <[email protected]>:
> > > >> > >> > > > I personally would like to see the following things done
> > > >> > >> > > > before things leave BETA:
> > > >> > >> > > > (1) Address data integrity concerns (Specifically thinking
> > > >> > >> > > > of METRON-370, METRON-517)
> > > >> > >> > > > (2) Make cluster tuning easier and more consistent
> > > >> > >> > > > (METRON-485, METRON-470, and the "[DISCUSS] moving parsers
> > > >> > >> > > > back to flux" which I can't find a JIRA for).
> > > >> > >> > > >
> > > >> > >> > > > I would also want to see the upgrade path (as opposed to
> > > >> > >> > > > rebuild) be more thoroughly and regularly tested once things
> > > >> > >> > > > leave BETA. From my perspective I think the project is very
> > > >> > >> > > > close but not yet ready.
> > > >> > >> > > >
> > > >> > >> > > > Jon
> > > >> > >> > > >
> > > >> > >> > > > On Wed, Nov 2, 2016 at 4:44 PM Casey Stella <[email protected]> wrote:
> > > >> > >> > > >
> > > >> > >> > > > Hello Everyone,
> > > >> > >> > > >
> > > >> > >> > > > Now that the discussion around the next release has started,
> > > >> > >> > > > it has been proposed and I think it's a good time to discuss
> > > >> > >> > > > what to name this next release.
> > > >> > >> > > > Before, we have adopted the BETA suffix. I think it might be
> > > >> > >> > > > time to drop it and call the next release 0.2.2
> > > >> > >> > > >
> > > >> > >> > > > Thoughts?
> > > >> > >> > > >
> > > >> > >> > > > Best,
> > > >> > >> > > >
> > > >> > >> > > > Casey
> > > >> > >> > > >
> > > >> > >> > > > --
> > > >> > >> > > >
> > > >> > >> > > > Jon
> > > >> > >> > >
> > > >> > >> > > -------------------
> > > >> > >> > > Thank you,
> > > >> > >> > >
> > > >> > >> > > James Sirota
> > > >> > >> > > PPMC- Apache Metron (Incubating)
> > > >> > >> > > jsirota AT apache DOT org
> > > >> > >> > >
> > > >> > >> > > --
> > > >> > >> > >
> > > >> > >> > > Jon
> > > >> > >> >
> > > >> > >> > --
> > > >> > >> >
> > > >> > >> > Jon
> > > >> > >> >
> > > >> > > --
> > > >> > >
> > > >> > > Jon
> > > >> >
> > > >> > -------------------
> > > >> > Thank you,
> > > >> >
> > > >> > James Sirota
> > > >> > PPMC- Apache Metron (Incubating)
> > > >> > jsirota AT apache DOT org
> > > >> >
> > > >> > --
> > > >> >
> > > >> > Jon
> > > >> >
> > >
> > > -------------------
> > > Thank you,
> > >
> > > James Sirota
> > > PPMC- Apache Metron (Incubating)
> > > jsirota AT apache DOT org
> > >
> > >
--
Jon

Sent from my mobile device
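
For reference, the truncation idea discussed in the thread for METRON-517 (truncate any field over 32766 and record the pre-truncation length, a hash, and a truncated flag) could be sketched roughly as below. This is only an illustration in Python, not Metron code: the function name, the metadata key suffixes, the choice of SHA-256, and measuring the limit in UTF-8 bytes are all assumptions made for the example.

```python
import hashlib

# Limit quoted in the thread; treating it as UTF-8 bytes is an assumption.
MAX_TERM_BYTES = 32766


def truncate_large_fields(message, limit=MAX_TERM_BYTES):
    """Return a copy of `message` with oversized string fields truncated.

    For every truncated field, hypothetical metadata keys are added that
    record the original byte length, a SHA-256 hash of the original value,
    and a boolean flag, so the full value can be located elsewhere later.
    """
    result = {}
    for key, value in message.items():
        if not isinstance(value, str):
            result[key] = value  # non-string fields pass through unchanged
            continue
        raw = value.encode("utf-8")
        if len(raw) <= limit:
            result[key] = value
            continue
        # Cut at the byte limit and drop any partial multi-byte character.
        result[key] = raw[:limit].decode("utf-8", errors="ignore")
        result[key + ".original_length"] = len(raw)
        result[key + ".original_sha256"] = hashlib.sha256(raw).hexdigest()
        result[key + ".truncated"] = True
    return result


# Example: only the oversized "uri" field is truncated and annotated.
example = {"timestamp": 1478304000000, "ip_src_addr": "10.0.0.1", "uri": "a" * 40000}
indexed = truncate_large_fields(example)
```

In this sketch the truncated copy is what would go to the index, while the untouched message would still be written to HDFS, where the timestamp (and, for example, ip_src_addr) could be used to find the full string again.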
