The TRAFODION-2001 changes have reached the point where I believe they are 
ready to merge to the Trafodion baseline as part of the Trafodion R2.2 
objectives. So consider this a petition to the Trafodion developer community 
for approval to merge into the mainline code base.

To be clear, this is a substantial change to the Trafodion Foundation 
components and has a direct effect on Trafodion configuration, 
installation, and operational capabilities. These changes are a 
fundamental shift going forward in that they will allow Trafodion to be more in 
line with the elasticity capabilities of the Hadoop stack.

I have attached the original email I sent a year ago along with a set of slides 
with a summary of these changes.

In addition, here is the pull request which contains the changes and Jenkins 
test results:

https://github.com/apache/incubator-trafodion/pull/1077

These changes do not complete the elasticity story, but they implement the 
infrastructure needed to complete it. So there is more to come, such as 
support in the Python installation scripts, Cloudera Manager parcel 
installation, Ambari installation, the SQL engine, and other Trafodion 
components to take advantage of this capability.

If you have any questions, please add them as comments in the JIRA 
(https://issues.apache.org/jira/browse/TRAFODION-2001) so that they can be 
captured and addressed. Also, you will find an updated document with more 
details in the JIRA.

Regards,

Zalo

From: Zalo Correa
Sent: Thursday, May 4, 2017 11:15 AM
To: '[email protected]' <[email protected]>
Subject: TRAFODION-2001

I would like to draw your attention to a fairly substantial change I would like 
to commit and merge to the Apache Incubator Trafodion code base. The changes 
are described in https://issues.apache.org/jira/browse/TRAFODION-2001 and the 
code changes are currently in 
https://github.com/apache/incubator-trafodion/pull/1077.

A little background: the design and most of the implementation was done in the 
spring of 2015 and donated to the Apache Foundation at the end of September 
2015. I have worked on it for the past year among my other tasks and have 
reached a point where I think it is ready for a merge to the mainline code base 
after more thorough testing.

The current status is that Jenkins tests pass with one exception:

core-regress-core-hdp<https://jenkins.esgyn.com/job/core-regress-core-hdp/2204/>
 fails at test core TEST116

However, this test appears to fail in my test environment even without the 
TRAFODION-2001 changes.

Real cluster testing is in progress: functional, HA, and performance.

As I mentioned above, the changes are substantial and your active code review 
participation would be most helpful in getting this needed functionality merged 
to the Apache Incubator Trafodion code base.

Please use the JIRA and/or the pull request as the communication vehicle for 
this activity.

Thank you in advance,

Zalo
Gonzalo Correa
Esgyn Corporation
[email protected]<mailto:[email protected]>

--- Begin Message ---
I would like to draw your attention to JIRA 
TRAFODION-2001<https://issues.apache.org/jira/browse/TRAFODION-2001> which 
specifies changes in configuration and operational components to support 
elasticity in Trafodion. My intent is to generate discussion, obtain feedback, 
correct mistakes, add missing items, and obtain consensus for when to integrate 
these changes into the mainline code. Inherent in this capability is the 
likelihood that other aspects of managing a Trafodion instance will require 
changes and possibly enhancements. At a minimum, these enhancements change the 
way current key process components are configured and managed, and the old way 
goes away (this means that you will want to know the details of this JIRA if 
you are an active contributor to Trafodion).



I am adding the contents of this email as an initial comment in the 
TRAFODION-2001 JIRA and request that all feedback be done as comments in the 
JIRA. I thank you in advance.



A little background: most of the implementation was done in the spring of 2015 
and donated to the Apache Foundation at the end of September 2015. I am in the 
process of merging these changes to the current Trafodion baseline in my 
private fork.



Here is where I need your active participation. To help with that, here is a 
brief summary:



First, review the document attached to 
TRAFODION-2001<https://issues.apache.org/jira/browse/TRAFODION-2001> JIRA, as 
you will need its context for what follows here.



Current state:

Trafodion Foundation components:

'monitor/shell':

*   'persist config/exec/info' commands are implemented

    o   A 'persist kill' command is not currently specified, which I believe to 
        be an unintended omission that needs to be added. The story is 
        incomplete without it: persistent processes, whose number grows and 
        contracts based on node membership, cannot otherwise be stopped with 
        one simple command.

    o   Some important items to consider with a 'persist kill' command:

        -   It will return an error when used with DTM persistent processes 
            (the transaction manager process should not be stopped in a 
            haphazard way).

            *   Are there other persistent processes that should also be 
                protected in this manner?

        -   Should it return an error with TSID persistent processes?

    o   The implementation of the 'persist kill' command corrects a problem 
        with the code generated in 'sscpstop' and 'ssmpstop'.

        -   The code currently generated does not take into account new 
            processes created when nodes are added.

*   'node config' command is implemented

*   'node add/delete' commands - TODO - in progress
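
To make the 'persist kill' considerations above concrete, here is a minimal sketch in Python. It is an illustration only, not the monitor's implementation: the names PROTECTED_TYPES and persist_kill_targets, the process-naming scheme, and the choice to protect only DTM are all assumptions made for the example.

```python
# Hypothetical sketch; none of these names come from the Trafodion monitor code.

# Assumption: only DTM is protected. Whether TSID (or other persistent
# process types) should be added here is the open question raised above.
PROTECTED_TYPES = {"DTM"}


def persist_kill_targets(process_type, node_ids):
    """Return one illustrative process name per currently configured node.

    Raises ValueError for protected persistent process types, mirroring the
    proposal that 'persist kill' returns an error for DTM processes.
    """
    type_name = process_type.upper()
    if type_name in PROTECTED_TYPES:
        raise ValueError(
            f"persist kill is not allowed for {type_name} processes"
        )
    # Derive the targets from the current node membership instead of a list
    # generated at configuration time, so processes on newly added nodes are
    # included.
    return [f"{type_name}-{node_id}" for node_id in sorted(node_ids)]
```

Because the target list is computed from the node membership at the time of the call, a node added after startup is covered without regenerating any stop script.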



'scripts' changes implemented

*   Compilation of the Trafodion configuration file, 'sqconfig', with the new 
    'persist' section is implemented ('sqgen' et al. scripts)

    o   The generation of 'gomon.cold' is greatly simplified, as are the 
        '<xxx>start' scripts

*   Creation and display of the configuration database is implemented



Location of merged changes:

git remote add zcorrea_fork [email protected]:zcorrea/incubator-trafodion

Branch: zcorrea_fork/TRAFODION-2001



Impact to other components:

Hadoop/Trafodion Installation

*   The ability to add and remove servers in an existing cluster implies 
    the provisioning and removal of operational resources of those servers.

    o   Trafodion depends on Hadoop and there is an implied order of 
        provisioning and operational readiness when adding servers to a 
        cluster.

    o   This order will be the reverse when removing servers from a cluster.
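
The ordering constraint above can be sketched as follows. This is a hedged illustration in Python: the layer names, PROVISION_ORDER, and provisioning_steps are hypothetical and do not correspond to any existing installer code.

```python
# Hypothetical layer names, listed in the order they must become ready when
# a server is added. Trafodion depends on Hadoop, so Hadoop comes first.
PROVISION_ORDER = ["hadoop", "trafodion"]


def provisioning_steps(action, server):
    """Return the (layer, server) steps for adding or removing a server."""
    if action == "add":
        layers = list(PROVISION_ORDER)
    elif action == "remove":
        # Removal tears layers down in the reverse of the provisioning order.
        layers = list(reversed(PROVISION_ORDER))
    else:
        raise ValueError(f"unknown action: {action}")
    return [(layer, server) for layer in layers]
```

For example, provisioning_steps("remove", "node5") lists the Trafodion step before the Hadoop step, which is the reverse of what "add" produces for the same server.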

Trafodion components

*   Existing functionality in Trafodion assumes that when an instance is 
    started, its static configuration does not change. Nodes may go down, 
    i.e., fail, but the number of configured nodes remains static. This will 
    no longer be true, as node membership will expand and contract during the 
    lifetime of an instance after initial instance startup.



I look forward to your feedback,

Zalo
Gonzalo Correa






--- End Message ---
