Hello James, Does that mean Metron 0.2.2 goes with HDP 2.5 by default?
- Dima

On 11/05/2016 06:26 AM, James Sirota wrote:
> Hi Kyle,
>
> The HDP upgrade guide can be found here:
> https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.0/bk_command-line-upgrade/content/ch_upgrade_2_4.html
>
> After executing these instructions you get to HDP 2.5 with no data loss. After that, upgrading Metron is as simple as saving the old configs, ES templates, grok statements from HDFS, and NiFi flows from your 0.2.1 build, installing 0.2.2 (via Ambari management pack), and putting the configs back into zookeeper, copying the ES templates and Grok files back, and restarting your NiFi flows. I agree that we should automate most of this eventually, and we will, but I don't think this is necessarily a show stopper for dropping BETA. Would you agree?
>
> Thanks,
> James
>
> 04.11.2016, 18:27, "Kyle Richardson" <[email protected]>:
>> I'm a little late to the party but thought I would go ahead and throw my two cents into the mix.
>>
>> I share the concern around an upgrade / migration path. While I would love to see the BETA dropped sooner rather than later, to me, this is a game changer for people implementing Metron. I think there is a silent expectation of no data loss after dropping the BETA tag.
>>
>> Even if there is not a direct upgrade path for a few releases, is there documentation that we could provide to ensure a data migration path for users? I'm not thinking anything automated just some instructions on what to do.
>>
>> -Kyle
>>
>> On Fri, Nov 4, 2016 at 9:16 AM, Casey Stella <[email protected]> wrote:
>>
>>> Jon,
>>>
>>> Thank you for your thoughts; they are appreciated and you should keep them coming. This kind of discussion is exactly why I sent out this thread. I think it's safe to say that the entire community shares your desire for Metron to be as easy to use as possible and a "data analysis platform for the masses." We should hold ourselves to a high standard, no doubt.
>>>
>>> Casey
>>>
>>> On Fri, Nov 4, 2016 at 6:30 AM, [email protected] <[email protected]> wrote:
>>>
>>> > Please understand that my points mostly relate to perception and ease of use, not what's technically possible or available. I'm coming at this as Metron should be a data analysis platform for the masses.
>>> >
>>> > METRON-517/542 - While I'm willing to let this one go it depends on your definition of non-issue. I personally believe that data (in every location that it exists) needs to be obvious and have ultra high integrity. I'm not concerned that the correct data won't exist somewhere in the cluster, I'm focusing on it being easily accessible by an operations team that may consist of entry level analysts. Once 517 is done and merged I would consider that a short term mitigation is in place.
>>> >
>>> > I feel like the project should stick to certain principles and a suggestion is that data access is easy, accurate, and obvious. Do we have anything like this that was agreed upon, discussed, or documented? Probably a discussion for a different thread.
>>> >
>>> > METRON-485/470/etc. were mostly to illustrate a consistency issue, and resolving them would give a better first impression (assuming that people monitoring the project will start using it more once it's non-BETA software). First impressions are big in my book and could affect initial adoption.
>>> >
>>> > Regarding 485 - Otto may be able to clarify but I thought somebody else saw this issue as well. I think the finger is currently being pointed at monit timeouts and not storm. It also doesn't happen every single time, I only run into it while the cluster is under load and after dozens of topology restarts that I do when tuning parallelism in storm. I'm going to be updating to storm 1.0.x in order to see if this still exists.
>>> > Again, this relates to ease of use/load testing/tuning.
>>> >
>>> > Agree with the upgrade comments - as long as it's supported at some defined point (IMHO this is when a project leaves BETA but others are welcome to disagree).
>>> >
>>> > Finally, I know this doesn't come across well in email but I'm just mentioning items which I think are important, not attempting to demand that they be fixed or that this doesn't leave beta. Thanks,
>>> >
>>> > Jon
>>> >
>>> > On Thu, Nov 3, 2016, 16:44 James Sirota <[email protected]> wrote:
>>> >
>>> > Hi Jon,
>>> >
>>> > Here are my thoughts around your objections.
>>> >
>>> > METRON-517/METRON-542
>>> >
>>> > I think the mechanism currently exists within Metron to make this a non-issue. I believe you can solve it with a combination of a Stellar statement and ES templates. As you mentioned, we can truncate the string and then include the relevant meta data in the message (original length, hash, etc). Cramming really long strings into ES is generally a bad thing, which is why this limitation exists. The metadata in the indexed message along with the timestamp allows you to pull data from HDFS should you need to recover the full string.
>>> >
>>> > METRON-485
>>> >
>>> > We cannot replicate this issue in our environment, but if this is indeed an issue this is an issue with Storm. A Jira should be filed against Storm and not against Metron. My hunch, though, is that it's probably something in your environment. I just tried stopping all topologies on my AWS cluster and then went to all Storm nodes and didn't see any workers left behind.
>>> >
>>> > METRON-470
>>> >
>>> > I think this is mainly a consistency issue. I don't think this impacts the stability or function of the software.
>>> > I think this is a nice to have, maybe in the next few releases, but I don't think we absolutely have to have this to drop BETA.
>>> >
>>> > With respect to upgrades, here are my thoughts. There is really no way to upgrade Metron 0.2.1 to Metron 0.2.2 in place because it requires a change of HDP. The new build will only be compatible with HDP 2.5 and not 2.4. So you have to lay down a new cluster regardless. We can document how to get the configs off of your old Metron and plug them into your new Metron so that it works the same. That shouldn't be a problem.
>>> >
>>> > Our upgrade path for future releases will revolve around the Ambari Metron management pack that is available with the upcoming build. Right now the install capability is available and the upgrade capability will come in incrementally within the next few releases. We will additionally deprecate Monit and switch that functionality to Ambari as well. Finally, we will also use Ambari for metrics monitoring. There is lots to do so we will triage and prioritize Jiras as a community to see which parts we want to tackle first. This is why your participation in the community is so valuable.
>>> >
>>> > Thanks,
>>> > James
>>> >
>>> > 03.11.2016, 11:07, "[email protected]" <[email protected]>:
>>> > > I agree that we can split METRON-517 into a short term and long term fix. I have attempted to organize my thoughts regarding the long term fix into METRON-542 and can get a PR out for METRON-517 soon to close that out.
>>> > >
>>> > > This leaves cluster tuning and a valid upgrade path for users, the latter of which is my predominant concern. If the team is willing to say that starting with 0.2.2 there will be a valid upgrade path to future releases I think that removing the BETA tag at 0.2.2 is reasonable.
>>> > > That said, this is just following my perception of what the BETA tag represents.
>>> > >
>>> > > Jon
>>> > >
>>> > > On Thu, Nov 3, 2016 at 11:50 AM Casey Stella <[email protected]> wrote:
>>> > >
>>> > >> Ok, regarding METRON-517, I've thought about this a bit having read your really great and detailed JIRA as well as the discussion around this on the dev list between you and Matt Foley. I want to separate the discussion between what is the correct long-term solution for this issue versus what is an acceptable solution.
>>> > >>
>>> > >> In terms of an acceptable work-around, my opinion is that because we allow the user to modify the ES template they can
>>> > >>
>>> > >> - Adjust the template to specify ignore_above <https://www.elastic.co/guide/en/elasticsearch/reference/current/ignore-above.html> on fields which they feel are likely to be large (maybe every string field)
>>> > >> - The combination of timestamp and ip_src_addr should be sufficient for picking out the raw data in question from the HDFS store
>>> > >> - A stellar enrichment can be used to tag the messages with large URIs and that can factor into the threat triage even or be used to filter in kibana
>>> > >> - As you say, you can use the profiler to track counts of such messages if you so desire and factor that into threat alerting or filtering in kibana.
>>> > >>
>>> > >> Ultimately, I believe we have exposed the appropriate set of tooling to provide an acceptable solution for the moment. Now, as for the best long-term solution, I will let the good discussion on the mailing list and JIRA continue and contribute my thoughts on the JIRA <https://issues.apache.org/jira/browse/METRON-517>.
>>> > >>
>>> > >> Of course, this is just $0.02 :)
>>> > >>
>>> > >> Apologies to Dave, I wanted to mark this aspect of the discussion on this thread as it is relevant to sufficient criteria to remove the BETA tag.
>>> > >>
>>> > >> Best,
>>> > >>
>>> > >> Casey
>>> > >>
>>> > >> On Thu, Nov 3, 2016 at 7:26 AM, [email protected] <[email protected]> wrote:
>>> > >>
>>> > >> > To clarify, it only needs to truncate fields > 32766 which need a full/exact string match search to be run on them (analyzed fields generally would not hit this limitation but I guess in theory they could). However, that's probably every field which can get > 32766 because I'm assuming those will all be strings.
>>> > >> >
>>> > >> > I also think using the profiler to monitor the truncation action could be a useful default.
>>> > >> >
>>> > >> > Jon
>>> > >> >
>>> > >> > On Wed, Nov 2, 2016, 21:08 [email protected] <[email protected]> wrote:
>>> > >> >
>>> > >> > > That would break searching on uri entirely unless you queried and knew to truncate at 32766 because it's not analyzed. I don't like pushing that complication to the end user.
>>> > >> > >
>>> > >> > > I would suggest truncation in the indexingBolt (not using stellar because you'd want this across the board) for all fields > 32766 (how do we make sure this gets updated if the limitation changes in Lucene?) and adding metadata key-value pairs (pre-trunc length, hash, truncated bool, etc.). In the URI scenario I would also suggest doing a multifield mapping by default because of the way that data is useful (not sure which analyser to use though - maybe write or find a good URI analyzer?).
>>> > >> > > Since timestamp is a required field for all messages (I'm pretty sure?) I'm ok with timestamp and field value used as the UID, but would prefer something better.
>>> > >> > >
>>> > >> > > Jon
>>> > >> > >
>>> > >> > > On Wed, Nov 2, 2016, 20:33 James Sirota <[email protected]> wrote:
>>> > >> > >
>>> > >> > > Jon,
>>> > >> > >
>>> > >> > > For METRON-517 would it suffice to have a stellar statement to take a URI string and truncate it to length of 32766 in the ES writer? But still write the actual string to HDFS? You can then search against ES on the truncated portion, but retrieve the actual string from HDFS. It's easy to do because you know the timestamp from the original message. So you know which logs in HDFS to search through to find the data.
>>> > >> > >
>>> > >> > > 02.11.2016, 14:12, "[email protected]" <[email protected]>:
>>> > >> > > > I personally would like to see the following things done before things leave BETA:
>>> > >> > > > (1) Address data integrity concerns (Specifically thinking of METRON-370, METRON-517)
>>> > >> > > > (2) Make cluster tuning easier and more consistent (METRON-485, METRON-470, and the "[DISCUSS] moving parsers back to flux" which I can't find a JIRA for).
>>> > >> > > >
>>> > >> > > > I would also want to see the upgrade path (as opposed to rebuild) be more thoroughly and regularly tested once things leave BETA. From my perspective I think the project is very close but not yet ready.
>>> > >> > > >
>>> > >> > > > Jon
>>> > >> > > >
>>> > >> > > > On Wed, Nov 2, 2016 at 4:44 PM Casey Stella <[email protected]> wrote:
>>> > >> > > >
>>> > >> > > > Hello Everyone,
>>> > >> > > >
>>> > >> > > > Now that the discussion around the next release has started, it has been proposed and I think it's a good time to discuss what to name this next release. Before, we have adopted the BETA suffix. I think it might be time to drop it and call the next release 0.2.2
>>> > >> > > >
>>> > >> > > > Thoughts?
>>> > >> > > >
>>> > >> > > > Best,
>>> > >> > > >
>>> > >> > > > Casey
>>> > >> > > >
>>> > >> > > > --
>>> > >> > > >
>>> > >> > > > Jon
>>> > >> > >
>>> > >> > > -------------------
>>> > >> > > Thank you,
>>> > >> > >
>>> > >> > > James Sirota
>>> > >> > > PPMC- Apache Metron (Incubating)
>>> > >> > > jsirota AT apache DOT org
>>> > >> > >
>>> > >> > > --
>>> > >> > >
>>> > >> > > Jon
>>> > >> > >
>>> > >> > --
>>> > >> >
>>> > >> > Jon
>>> > >> >
>>> > > --
>>> > >
>>> > > Jon
>>> >
>>> > -------------------
>>> > Thank you,
>>> >
>>> > James Sirota
>>> > PPMC- Apache Metron (Incubating)
>>> > jsirota AT apache DOT org
>>> >
>>> > --
>>> >
>>> > Jon
>>> >
> -------------------
> Thank you,
>
> James Sirota
> PPMC- Apache Metron (Incubating)
> jsirota AT apache DOT org
>

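For anyone skimming the METRON-517 part of this thread, the truncate-and-tag idea (cut an oversized field before it is indexed, and keep the pre-truncation length, a hash, and a truncated flag alongside it) can be sketched roughly as below. This is only an illustration in Python, not Metron code: the function name, the `_truncated` / `_original_length` / `_sha256` key suffixes, and the choice of SHA-256 are all assumptions, and the 32766 figure from the thread is treated here as a limit on UTF-8 bytes.

```python
import hashlib

# Limit quoted in the thread; interpreted here as UTF-8 bytes (an assumption).
MAX_TERM_BYTES = 32766


def truncate_field(message, field, limit=MAX_TERM_BYTES):
    """Return a copy of `message` with `field` truncated to `limit` bytes.

    Hypothetical helper: when truncation happens, metadata keys are added so
    the full value can later be located in the raw store using the message
    timestamp plus the hash. Messages that fit are returned unchanged.
    """
    value = message.get(field)
    if not isinstance(value, str):
        return message

    raw = value.encode("utf-8")
    if len(raw) <= limit:
        return message

    out = dict(message)  # do not mutate the caller's message
    # errors="ignore" drops a multi-byte character cut in half at the limit.
    out[field] = raw[:limit].decode("utf-8", errors="ignore")
    out[field + "_truncated"] = True
    out[field + "_original_length"] = len(raw)
    out[field + "_sha256"] = hashlib.sha256(raw).hexdigest()
    return out


if __name__ == "__main__":
    msg = {"timestamp": 1478100000000, "uri": "/" + "a" * 40000}
    indexed = truncate_field(msg, "uri")
    print(len(indexed["uri"]), indexed["uri_original_length"], indexed["uri_truncated"])
```

Keeping the hash of the full value means an exact-match lookup is still possible against the raw copy, which is the gap Jon raised about searching a truncated, non-analyzed field.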