[GitHub] [samza] lakshmi-manasa-g commented on a change in pull request #938: SAMZA-1531: Support run.id in standalone for batch processing.

2019-03-07 Thread GitBox
lakshmi-manasa-g commented on a change in pull request #938: SAMZA-1531: 
Support run.id in standalone for batch processing.
URL: https://github.com/apache/samza/pull/938#discussion_r263661144
 
 

 ##
 File path: 
samza-core/src/main/java/org/apache/samza/coordinator/CoordinationUtils.java
 ##
 @@ -39,6 +42,12 @@
 
   DistributedLockWithState getLockWithState(String lockId);
 
+  DistributedReadWriteLock getReadWriteLock();
 
 Review comment:
   Added lock id.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [samza] lakshmi-manasa-g commented on a change in pull request #938: SAMZA-1531: Support run.id in standalone for batch processing.

2019-03-07 Thread GitBox
lakshmi-manasa-g commented on a change in pull request #938: SAMZA-1531: 
Support run.id in standalone for batch processing.
URL: https://github.com/apache/samza/pull/938#discussion_r263661119
 
 

 ##
 File path: 
samza-core/src/main/java/org/apache/samza/runtime/LocalApplicationRunner.java
 ##
 @@ -280,4 +368,47 @@ private void setApplicationFinalStatus() {
   }
 }
   }
+
+
+  /**
+   * Defines a specific implementation of {@link CoordinationSessionListener} 
for local {@link CoordinationUtils}
+   */
+  private final class LocalCoordinationSessionListener implements 
CoordinationSessionListener {
+
+/**
+ * If the coordination utils session has reconnected, check if global 
runid differs from local runid
+ * if it differs then shut down processor and throw exception
+ * else recreate ephemeral node corresponding to this processor inside the 
read write lock for runid
+ */
+@Override
+public void handleReconnect() {
+  LOG.info("Reconnected to coordination utils");
+  if(coordinationUtils == null) {
+return;
+  }
+  DistributedDataAccess runIdAccess = coordinationUtils.getDataAccess();
+  String globalRunId = (String) runIdAccess.readData(RUNID_PATH);
+  if (!runId.equals(globalRunId)) {
+processors.forEach(StreamProcessor::stop);
+cleanup();
+appStatus = ApplicationStatus.UnsuccessfulFinish;
+String msg = String.format("run.id %s on processor %s differs from the 
global run.id %s", runId, uid, globalRunId);
+throw new SamzaException(msg);
+  } else if(runIdLock != null) {
+String msg = String.format("Processor %s failed to get the lock for run.id", uid);
+try {
+  // acquire lock to recreate active processor ephemeral node
+  DistributedReadWriteLock.AccessType lockAccess = 
runIdLock.lock(LOCK_TIMEOUT, LOCK_TIMEOUT_UNIT);
+  if(lockAccess == DistributedReadWriteLock.AccessType.WRITE || 
lockAccess == DistributedReadWriteLock.AccessType.READ) {
+LOG.info("Processor {} creates its active processor ephemeral node in read write lock again", uid);
+runIdLock.unlock();
+  } else {
+throw new SamzaException(msg);
+  }
+} catch (TimeoutException e) {
+  throw new SamzaException(msg, e);
 
 Review comment:
   Done.




[GitHub] [samza] lakshmi-manasa-g commented on a change in pull request #938: SAMZA-1531: Support run.id in standalone for batch processing.

2019-03-07 Thread GitBox
lakshmi-manasa-g commented on a change in pull request #938: SAMZA-1531: 
Support run.id in standalone for batch processing.
URL: https://github.com/apache/samza/pull/938#discussion_r263661134
 
 

 ##
 File path: 
samza-core/src/main/java/org/apache/samza/coordinator/CoordinationUtils.java
 ##
 @@ -39,6 +42,12 @@
 
   DistributedLockWithState getLockWithState(String lockId);
 
+  DistributedReadWriteLock getReadWriteLock();
+
+  DistributedDataAccess getDataAccess();
+
+  void setCoordinationSessionListener(CoordinationSessionListener 
sessionListener);
 
 Review comment:
   done.




[GitHub] [samza] lakshmi-manasa-g commented on a change in pull request #938: SAMZA-1531: Support run.id in standalone for batch processing.

2019-03-07 Thread GitBox
lakshmi-manasa-g commented on a change in pull request #938: SAMZA-1531: 
Support run.id in standalone for batch processing.
URL: https://github.com/apache/samza/pull/938#discussion_r26366
 
 

 ##
 File path: 
samza-core/src/main/java/org/apache/samza/runtime/LocalApplicationRunner.java
 ##
 @@ -280,4 +368,47 @@ private void setApplicationFinalStatus() {
   }
 }
   }
+
+
+  /**
+   * Defines a specific implementation of {@link CoordinationSessionListener} 
for local {@link CoordinationUtils}
+   */
+  private final class LocalCoordinationSessionListener implements 
CoordinationSessionListener {
+
+/**
+ * If the coordination utils session has reconnected, check if global 
runid differs from local runid
+ * if it differs then shut down processor and throw exception
+ * else recreate ephemeral node corresponding to this processor inside the 
read write lock for runid
+ */
+@Override
+public void handleReconnect() {
+  LOG.info("Reconnected to coordination utils");
+  if(coordinationUtils == null) {
+return;
+  }
+  DistributedDataAccess runIdAccess = coordinationUtils.getDataAccess();
+  String globalRunId = (String) runIdAccess.readData(RUNID_PATH);
+  if (!runId.equals(globalRunId)) {
+processors.forEach(StreamProcessor::stop);
+cleanup();
+appStatus = ApplicationStatus.UnsuccessfulFinish;
+String msg = String.format("run.id %s on processor %s differs from the 
global run.id %s", runId, uid, globalRunId);
+throw new SamzaException(msg);
 
 Review comment:
   Done. doing shutdownLatch.countDown() also.




[GitHub] [samza] lakshmi-manasa-g commented on a change in pull request #938: SAMZA-1531: Support run.id in standalone for batch processing.

2019-03-07 Thread GitBox
lakshmi-manasa-g commented on a change in pull request #938: SAMZA-1531: 
Support run.id in standalone for batch processing.
URL: https://github.com/apache/samza/pull/938#discussion_r263661126
 
 

 ##
 File path: 
samza-core/src/main/java/org/apache/samza/runtime/LocalApplicationRunner.java
 ##
 @@ -280,4 +368,47 @@ private void setApplicationFinalStatus() {
   }
 }
   }
+
+
+  /**
+   * Defines a specific implementation of {@link CoordinationSessionListener} 
for local {@link CoordinationUtils}
+   */
+  private final class LocalCoordinationSessionListener implements 
CoordinationSessionListener {
+
+/**
+ * If the coordination utils session has reconnected, check if global 
runid differs from local runid
+ * if it differs then shut down processor and throw exception
+ * else recreate ephemeral node corresponding to this processor inside the 
read write lock for runid
+ */
+@Override
+public void handleReconnect() {
+  LOG.info("Reconnected to coordination utils");
+  if(coordinationUtils == null) {
+return;
+  }
+  DistributedDataAccess runIdAccess = coordinationUtils.getDataAccess();
+  String globalRunId = (String) runIdAccess.readData(RUNID_PATH);
+  if (!runId.equals(globalRunId)) {
+processors.forEach(StreamProcessor::stop);
+cleanup();
+appStatus = ApplicationStatus.UnsuccessfulFinish;
+String msg = String.format("run.id %s on processor %s differs from the 
global run.id %s", runId, uid, globalRunId);
+throw new SamzaException(msg);
+  } else if(runIdLock != null) {
+String msg = String.format("Processor %s failed to get the lock for run.id", uid);
+try {
+  // acquire lock to recreate active processor ephemeral node
+  DistributedReadWriteLock.AccessType lockAccess = 
runIdLock.lock(LOCK_TIMEOUT, LOCK_TIMEOUT_UNIT);
+  if(lockAccess == DistributedReadWriteLock.AccessType.WRITE || 
lockAccess == DistributedReadWriteLock.AccessType.READ) {
 
 Review comment:
   Session expiration does clean up all the ephemeral nodes. But according to 
the ZkClient code, two things can happen after an expiration: a successful 
reconnect, in which case listener.handleNewSession is invoked, or a failure to 
reconnect, in which case listener.handleSessionEstablishmentError is invoked. 
LocalApplicationRunner now handles both.
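
   A minimal sketch of handling both outcomes, assuming a listener with the 
two callbacks named above. `SessionListener`, `RunIdSessionHandler`, and the 
`Supplier` used to read the global run.id are hypothetical stand-ins for 
illustration; this is not the code in the PR.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for the two session callbacks named in the comment. */
interface SessionListener {
  void handleNewSession();

  void handleSessionEstablishmentError(Throwable error);
}

/** Sketch only: re-registers after a reconnect, stops when no session can be established. */
final class RunIdSessionHandler implements SessionListener {
  private final String localRunId;
  private final Supplier<String> globalRunIdReader; // assumed way to read the shared run.id
  private boolean registered;
  private boolean shutDown;

  RunIdSessionHandler(String localRunId, Supplier<String> globalRunIdReader) {
    this.localRunId = localRunId;
    this.globalRunIdReader = globalRunIdReader;
  }

  @Override
  public void handleNewSession() {
    // The expired session took its ephemeral nodes with it, so check the run.id before re-registering.
    if (!localRunId.equals(globalRunIdReader.get())) {
      stop();
      return;
    }
    registered = true; // stands in for recreating this processor's ephemeral node
  }

  @Override
  public void handleSessionEstablishmentError(Throwable error) {
    // Without a session there is nothing to coordinate through, so stop instead of running on a stale view.
    stop();
  }

  private void stop() {
    registered = false;
    shutDown = true;
  }

  boolean isRegistered() {
    return registered;
  }

  boolean isShutDown() {
    return shutDown;
  }
}
```

   The point of the sketch is that a reconnect is not treated as a no-op: the 
run.id is checked again before the processor re-registers.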




[GitHub] [samza] rmatharu commented on a change in pull request #942: Bugfix: Recent CSM refactor was causing some metrics to not be emitted. Fixed -restore-time metric.

2019-03-07 Thread GitBox
rmatharu commented on a change in pull request #942: Bugfix: Recent CSM 
refactor was causing some metrics to not be emitted. Fixed -restore-time 
metric.
URL: https://github.com/apache/samza/pull/942#discussion_r263657205
 
 

 ##
 File path: 
samza-core/src/main/scala/org/apache/samza/storage/ContainerStorageManager.java
 ##
 @@ -573,6 +576,25 @@ private StorageEngine createStore(String storeName, 
TaskName taskName, TaskModel
 return 
this.sideInputStorageManagers.values().stream().collect(Collectors.toSet());
   }
 
+  /**
+   * Registers any CSM created metrics such as side-inputs related metrics, 
and standby-task related metrics.
+   * @param metricsReporters metrics reporters to use
+   */
+  public void registerMetrics(Map metricsReporters) {
 
 Review comment:
   Done. 
   It did not require a group-name change, so all existing metrics remain as is.




[GitHub] [samza] shanthoosh commented on a change in pull request #942: Bugfix: Recent CSM refactor was causing some metrics to not be emitted. Fixed -restore-time metric.

2019-03-07 Thread GitBox
shanthoosh commented on a change in pull request #942: Bugfix: Recent CSM 
refactor was causing some metrics to not be emitted. Fixed -restore-time 
metric.
URL: https://github.com/apache/samza/pull/942#discussion_r263647669
 
 

 ##
 File path: 
samza-core/src/main/scala/org/apache/samza/storage/ContainerStorageManager.java
 ##
 @@ -573,6 +576,25 @@ private StorageEngine createStore(String storeName, 
TaskName taskName, TaskModel
 return 
this.sideInputStorageManagers.values().stream().collect(Collectors.toSet());
   }
 
+  /**
+   * Registers any CSM created metrics such as side-inputs related metrics, 
and standby-task related metrics.
+   * @param metricsReporters metrics reporters to use
+   */
+  public void registerMetrics(Map metricsReporters) {
 
 Review comment:
   > SystemConsumerMetrics to take a configurable group name
   
   Alerting in ingraphs and auto-scaling rules at LinkedIn are defined based 
on the Samza-generated metric names. The group name is the prefix (the initial 
part) of the generated metric name. It would be better to take these use cases 
into account and maintain backwards compatibility when changing the metric names. 




[GitHub] [samza] vjagadish1989 edited a comment on issue #905: SAMZA-2055: [WIP] Async high level api

2019-03-07 Thread GitBox
vjagadish1989 edited a comment on issue #905: SAMZA-2055: [WIP] Async high 
level api
URL: https://github.com/apache/samza/pull/905#issuecomment-470775503
 
 
   also adding @xinyuiscool in case he has additional feedback on the async-api




[GitHub] [samza] cameronlee314 commented on a change in pull request #940: SAMZA-2121: Add checkpoint offset field to IncomingMessageEnvelope

2019-03-07 Thread GitBox
cameronlee314 commented on a change in pull request #940: SAMZA-2121: Add 
checkpoint offset field to IncomingMessageEnvelope
URL: https://github.com/apache/samza/pull/940#discussion_r263639549
 
 

 ##
 File path: 
samza-api/src/main/java/org/apache/samza/system/IncomingMessageEnvelope.java
 ##
 @@ -86,7 +101,29 @@ public IncomingMessageEnvelope(SystemStreamPartition 
systemStreamPartition, Stri
*/
   public IncomingMessageEnvelope(SystemStreamPartition systemStreamPartition, 
String offset,
   Object key, Object message, int size, long eventTime, long arrivalTime) {
-this(systemStreamPartition, offset, key, message, size);
+this(systemStreamPartition, offset, offset, key, message, size, eventTime, 
arrivalTime);
+  }
+
+  /**
+   * Constructs a new IncomingMessageEnvelope from specified components
+   * @param systemStreamPartition The aggregate object representing the 
incoming stream name, the name of the cluster
+   * from which the stream came, and the partition of the stream from which 
the message was received.
+   * @param offset The offset in the partition that the message was received 
from.
+   * @param checkpointOffset offset that can be checkpointed when this {@link 
IncomingMessageEnvelope} is processed
+   * @param key A deserialized key received from the partition offset.
+   * @param message A deserialized message received from the partition offset.
+   * @param size size of the message and key in bytes.
+   * @param eventTime the timestamp (in epochMillis) of when this event 
happened
+   * @param arrivalTime the timestamp (in epochMillis) of when this event 
arrived to (i.e., was picked-up by) Samza
+   */
+  public IncomingMessageEnvelope(SystemStreamPartition systemStreamPartition, 
String offset, String checkpointOffset,
+  Object key, Object message, int size, long eventTime, long arrivalTime) {
+this.systemStreamPartition = systemStreamPartition;
+this.offset = offset;
+this.checkpointOffset = checkpointOffset;
 
 Review comment:
   1. Some more details and an example are described at 
https://issues.apache.org/jira/browse/SAMZA-2120.
   2. We could make this part of the internal API by creating an 
`InternalIncomingMessageEnvelope` which is passed around internally and extends 
`IncomingMessageEnvelope`. Would that help avoid complicating the public API?
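
   A rough sketch of option 2, using simplified stand-in classes. 
`PublicEnvelope` and `InternalEnvelope` are hypothetical names with far fewer 
fields than the real `IncomingMessageEnvelope`; the sketch only shows how the 
extra offset could stay off the public type.

```java
/** Simplified stand-in for the public envelope; the real class carries more fields. */
class PublicEnvelope {
  private final String offset;
  private final Object message;

  PublicEnvelope(String offset, Object message) {
    this.offset = offset;
    this.message = message;
  }

  String getOffset() {
    return offset;
  }

  Object getMessage() {
    return message;
  }
}

/** Hypothetical internal subclass that carries the offset that is safe to checkpoint. */
final class InternalEnvelope extends PublicEnvelope {
  private final String checkpointOffset;

  InternalEnvelope(String offset, String checkpointOffset, Object message) {
    super(offset, message);
    this.checkpointOffset = checkpointOffset;
  }

  String getCheckpointOffset() {
    return checkpointOffset;
  }

  /** Framework-side helper: falls back to the message offset for plain public envelopes. */
  static String checkpointOffsetOf(PublicEnvelope envelope) {
    if (envelope instanceof InternalEnvelope) {
      return ((InternalEnvelope) envelope).getCheckpointOffset();
    }
    return envelope.getOffset();
  }
}
```

   User code would only ever see `PublicEnvelope` accessors, while the 
framework resolves the checkpoint offset through the static helper.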




[GitHub] [samza] mynameborat commented on a change in pull request #942: Bugfix: Recent CSM refactor was causing some metrics to not be emitted. Fixed -restore-time metric.

2019-03-07 Thread GitBox
mynameborat commented on a change in pull request #942: Bugfix: Recent CSM 
refactor was causing some metrics to not be emitted. Fixed -restore-time 
metric.
URL: https://github.com/apache/samza/pull/942#discussion_r263637333
 
 

 ##
 File path: 
samza-core/src/main/scala/org/apache/samza/storage/ContainerStorageManager.java
 ##
 @@ -573,6 +576,25 @@ private StorageEngine createStore(String storeName, 
TaskName taskName, TaskModel
 return 
this.sideInputStorageManagers.values().stream().collect(Collectors.toSet());
   }
 
+  /**
+   * Registers any CSM created metrics such as side-inputs related metrics, 
and standby-task related metrics.
+   * @param metricsReporters metrics reporters to use
+   */
+  public void registerMetrics(Map metricsReporters) {
 
 Review comment:
   +1. We would need to change `SystemConsumerMetrics` to take a 
configurable group name instead of using the default class name as the group 
id. I feel that change is much simpler than copying the boilerplate code for 
metrics registration over to CSM.
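
   A minimal sketch of that idea, assuming a helper whose group name 
defaults to the class name. `GroupedMetrics` and the `group.name` layout are 
hypothetical and used only for illustration; this is not the actual 
`SystemConsumerMetrics` code.

```java
/** Hypothetical metrics helper, not Samza's SystemConsumerMetrics. */
class GroupedMetrics {
  private final String group;

  /** Default: the class name is the group, so existing metric names stay unchanged. */
  GroupedMetrics() {
    this.group = getClass().getName();
  }

  /** Callers that need a different group pass it explicitly. */
  GroupedMetrics(String group) {
    this.group = group;
  }

  String group() {
    return group;
  }

  /** The "group.name" layout is an assumption made for this sketch. */
  String metricName(String name) {
    return group + "." + name;
  }
}
```

   Keeping the no-argument constructor means callers that do not pass a 
group keep producing the same names as before.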




[GitHub] [samza] shanthoosh commented on a change in pull request #940: SAMZA-2121: Add checkpoint offset field to IncomingMessageEnvelope

2019-03-07 Thread GitBox
shanthoosh commented on a change in pull request #940: SAMZA-2121: Add 
checkpoint offset field to IncomingMessageEnvelope
URL: https://github.com/apache/samza/pull/940#discussion_r263628931
 
 

 ##
 File path: 
samza-api/src/main/java/org/apache/samza/system/IncomingMessageEnvelope.java
 ##
 @@ -86,7 +101,29 @@ public IncomingMessageEnvelope(SystemStreamPartition 
systemStreamPartition, Stri
*/
   public IncomingMessageEnvelope(SystemStreamPartition systemStreamPartition, 
String offset,
   Object key, Object message, int size, long eventTime, long arrivalTime) {
-this(systemStreamPartition, offset, key, message, size);
+this(systemStreamPartition, offset, offset, key, message, size, eventTime, 
arrivalTime);
+  }
+
+  /**
+   * Constructs a new IncomingMessageEnvelope from specified components
+   * @param systemStreamPartition The aggregate object representing the 
incoming stream name, the name of the cluster
+   * from which the stream came, and the partition of the stream from which 
the message was received.
+   * @param offset The offset in the partition that the message was received 
from.
+   * @param checkpointOffset offset that can be checkpointed when this {@link 
IncomingMessageEnvelope} is processed
+   * @param key A deserialized key received from the partition offset.
+   * @param message A deserialized message received from the partition offset.
+   * @param size size of the message and key in bytes.
+   * @param eventTime the timestamp (in epochMillis) of when this event 
happened
+   * @param arrivalTime the timestamp (in epochMillis) of when this event 
arrived to (i.e., was picked-up by) Samza
+   */
+  public IncomingMessageEnvelope(SystemStreamPartition systemStreamPartition, 
String offset, String checkpointOffset,
+  Object key, Object message, int size, long eventTime, long arrivalTime) {
+this.systemStreamPartition = systemStreamPartition;
+this.offset = offset;
+this.checkpointOffset = checkpointOffset;
 
 Review comment:
   1. Can you please share the rationale behind adding the checkpoint offset to 
the `IncomingMessageEnvelope` public API? 
   2. Can you please shed some light on the other options considered here? Is 
it not possible to do this without changing the public API? 
   
   IMHO, it would be better not to clutter the public API with these internal 
details, which will be hard to remove later on. Anyone who reads this in the 
future would wonder why there are two different offsets in the 
`IncomingMessageEnvelope` contract. Is it possible to maintain the mapping 
(the association we want for safe checkpointing) internally on the Samza 
framework side and not expose it to Samza users?
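
   One way such an internal mapping could look, as a sketch under stated 
assumptions: `SafeCheckpointRegistry`, its string keys, and the fallback to 
the message offset are all hypothetical and not part of Samza.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical framework-side registry: maps a (partition, offset) pair to the offset that is safe to checkpoint. */
final class SafeCheckpointRegistry {
  private final Map<String, String> safeOffsets = new HashMap<>();

  private static String key(String partition, String offset) {
    return partition + "@" + offset;
  }

  /** Called by the consumer side when it knows a safe offset for a message. */
  void record(String partition, String offset, String checkpointOffset) {
    safeOffsets.put(key(partition, offset), checkpointOffset);
  }

  /** Returns the recorded safe offset, or the message offset itself when none was recorded. */
  String checkpointOffsetFor(String partition, String offset) {
    return safeOffsets.getOrDefault(key(partition, offset), offset);
  }

  /** Drops the entry once the message has been checkpointed so the map does not grow without bound. */
  void forget(String partition, String offset) {
    safeOffsets.remove(key(partition, offset));
  }
}
```

   The trade-off is extra bookkeeping inside the framework in exchange for 
leaving the public envelope type untouched.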




[GitHub] [samza] xinyuiscool merged pull request #943: Update Samza version for 1.1.0 release branch

2019-03-07 Thread GitBox
xinyuiscool merged pull request #943: Update Samza version for 1.1.0 release 
branch
URL: https://github.com/apache/samza/pull/943
 
 
   




[GitHub] [samza] shanthoosh commented on a change in pull request #940: SAMZA-2121: Add checkpoint offset field to IncomingMessageEnvelope

2019-03-07 Thread GitBox
shanthoosh commented on a change in pull request #940: SAMZA-2121: Add 
checkpoint offset field to IncomingMessageEnvelope
URL: https://github.com/apache/samza/pull/940#discussion_r263628931
 
 

 ##
 File path: 
samza-api/src/main/java/org/apache/samza/system/IncomingMessageEnvelope.java
 ##
 @@ -86,7 +101,29 @@ public IncomingMessageEnvelope(SystemStreamPartition 
systemStreamPartition, Stri
*/
   public IncomingMessageEnvelope(SystemStreamPartition systemStreamPartition, 
String offset,
   Object key, Object message, int size, long eventTime, long arrivalTime) {
-this(systemStreamPartition, offset, key, message, size);
+this(systemStreamPartition, offset, offset, key, message, size, eventTime, 
arrivalTime);
+  }
+
+  /**
+   * Constructs a new IncomingMessageEnvelope from specified components
+   * @param systemStreamPartition The aggregate object representing the 
incoming stream name, the name of the cluster
+   * from which the stream came, and the partition of the stream from which 
the message was received.
+   * @param offset The offset in the partition that the message was received 
from.
+   * @param checkpointOffset offset that can be checkpointed when this {@link 
IncomingMessageEnvelope} is processed
+   * @param key A deserialized key received from the partition offset.
+   * @param message A deserialized message received from the partition offset.
+   * @param size size of the message and key in bytes.
+   * @param eventTime the timestamp (in epochMillis) of when this event 
happened
+   * @param arrivalTime the timestamp (in epochMillis) of when this event 
arrived to (i.e., was picked-up by) Samza
+   */
+  public IncomingMessageEnvelope(SystemStreamPartition systemStreamPartition, 
String offset, String checkpointOffset,
+  Object key, Object message, int size, long eventTime, long arrivalTime) {
+this.systemStreamPartition = systemStreamPartition;
+this.offset = offset;
+this.checkpointOffset = checkpointOffset;
 
 Review comment:
   1. Can you please share the rationale behind adding the checkpoint offset to 
the `IncomingMessageEnvelope` public API? 
   2. Can you please shed some light on the other options considered here? Is 
it not possible to do this without changing the public API? 
   
   IMHO, it would be better not to clutter the public API with these internal 
details, which will be hard to remove later on. Anyone who reads this in the 
future would wonder why there are two different offsets in the 
`IncomingMessageEnvelope` contract. Is it possible to maintain the mapping 
(the association we want for safe checkpointing) internally on the Samza 
framework side and not expose it to Samza users?




[GitHub] [samza] shanthoosh commented on a change in pull request #940: SAMZA-2121: Add checkpoint offset field to IncomingMessageEnvelope

2019-03-07 Thread GitBox
shanthoosh commented on a change in pull request #940: SAMZA-2121: Add 
checkpoint offset field to IncomingMessageEnvelope
URL: https://github.com/apache/samza/pull/940#discussion_r263628931
 
 

 ##
 File path: 
samza-api/src/main/java/org/apache/samza/system/IncomingMessageEnvelope.java
 ##
 @@ -86,7 +101,29 @@ public IncomingMessageEnvelope(SystemStreamPartition 
systemStreamPartition, Stri
*/
   public IncomingMessageEnvelope(SystemStreamPartition systemStreamPartition, 
String offset,
   Object key, Object message, int size, long eventTime, long arrivalTime) {
-this(systemStreamPartition, offset, key, message, size);
+this(systemStreamPartition, offset, offset, key, message, size, eventTime, 
arrivalTime);
+  }
+
+  /**
+   * Constructs a new IncomingMessageEnvelope from specified components
+   * @param systemStreamPartition The aggregate object representing the 
incoming stream name, the name of the cluster
+   * from which the stream came, and the partition of the stream from which 
the message was received.
+   * @param offset The offset in the partition that the message was received 
from.
+   * @param checkpointOffset offset that can be checkpointed when this {@link 
IncomingMessageEnvelope} is processed
+   * @param key A deserialized key received from the partition offset.
+   * @param message A deserialized message received from the partition offset.
+   * @param size size of the message and key in bytes.
+   * @param eventTime the timestamp (in epochMillis) of when this event 
happened
+   * @param arrivalTime the timestamp (in epochMillis) of when this event 
arrived to (i.e., was picked-up by) Samza
+   */
+  public IncomingMessageEnvelope(SystemStreamPartition systemStreamPartition, 
String offset, String checkpointOffset,
+  Object key, Object message, int size, long eventTime, long arrivalTime) {
+this.systemStreamPartition = systemStreamPartition;
+this.offset = offset;
+this.checkpointOffset = checkpointOffset;
 
 Review comment:
   1. Can you please share the rationale behind adding the checkpoint offset to 
the `IncomingMessageEnvelope` public API? 
   2. Can you please shed some light on the other options considered here? Is 
it not possible to do this without changing the public API? 
   
   IMHO, it would be better not to clutter the public API with these internal 
details, which will be hard to remove later on. Anyone who reads this in the 
future would wonder why there are two different offsets in the 
`IncomingMessageEnvelope` contract. Is it possible to maintain the mapping 
(the association we want for safe checkpointing) internally on the Samza 
framework side and not expose it to users?




[GitHub] [samza] dxichen opened a new pull request #944: Release version updates

2019-03-07 Thread GitBox
dxichen opened a new pull request #944: Release version updates
URL: https://github.com/apache/samza/pull/944
 
 
   Updated hard coded version for the upcoming release




[GitHub] [samza] dxichen opened a new pull request #943: Update Samza version for 1.1.0 release branch

2019-03-07 Thread GitBox
dxichen opened a new pull request #943: Update Samza version for 1.1.0 release 
branch
URL: https://github.com/apache/samza/pull/943
 
 
   Change versions as per RELEASE.md




Re: Trouble running samza 0.14 in standalone mode

2019-03-07 Thread Yi Pan
Great! Glad that you were able to figure it out!

-Yi

On Thu, Mar 7, 2019 at 3:14 AM Anoop Krishnakumar <
anoop.krishnaku...@gmail.com> wrote:

> Hi Yi,
>
> Apologies for my ignorance. I did not realize that attachments won't make it
> through and that a gist is the preferred way to share logs and code snippets.
>
> The issue is resolved. I was using the default task.name.grouper.factory
> instead of GroupByContainerIdsFactory.
> Thanks for the response, I appreciate it. I will consider using version 1.0.
>
> -anoop
>
>
> On Thu, 7 Mar 2019 at 01:41, Yi Pan wrote:
>
> > Hi, Anoop,
> >
> > 1. Please provide the full log file if possible. Just listing a single
> > log line reporting the failure does not help.
> > 2. Is there any reason that you are still on Samza 0.14? I would highly
> > recommend upgrading to 1.0, since tons of API and standalone-related
> > improvements went into that version after 0.14.
> >
> > -Yi
> >
> > On Wed, Mar 6, 2019 at 1:54 PM Anoop Krishnakumar <
> > anoop.krishnaku...@gmail.com> wrote:
> >
> > > I am prototyping an application where I have two input topics and two
> > > output topics. The application also uses a RocksDB state store that is
> > > backed by Kafka.
> > >
> > > I followed the WikipediaZkLocalApplication example, but the application
> > > always exits without listening for messages from the topics. I could see
> > > the execution plan and the job coordinator communicating with ZooKeeper.
> > > I could see the following line in the logs:
> > > StreamProcessor [INFO] Container is not instantiated for stream processor:
> > > 5f7e2a46-fb7a-4054-bf5c-c62423ca2e35.
> > > I could not figure out why the container wouldn't start. Any help would
> > > be appreciated.
> > >
> > > Attached is the log and the job configuration.
> > >
> > > -anoop
> > >
> >
>


[GitHub] [samza] cameronlee314 edited a comment on issue #941: SAMZA-2120: Enable custom handling of ConsumerRecords consumed by Kafka

2019-03-07 Thread GitBox
cameronlee314 edited a comment on issue #941: SAMZA-2120: Enable custom 
handling of ConsumerRecords consumed by Kafka 
URL: https://github.com/apache/samza/pull/941#issuecomment-470745081
 
 
   > @cameronlee314 I'd prefer to not add new fields/semantics to IME if 
possible. I'm assuming you want to return the safe offset from LiKafka as the 
checkpoint offset? Any other alternatives to handle it?
   
   That's correct.
   I have sent over some alternative solutions separately.
   For questions about changing IME, could we please use 
https://github.com/apache/samza/pull/940? This one just has the IME changes 
since it depends on those and I haven't checked in the other PR yet.




[GitHub] [samza] cameronlee314 commented on issue #941: SAMZA-2120: Enable custom handling of ConsumerRecords consumed by Kafka

2019-03-07 Thread GitBox
cameronlee314 commented on issue #941: SAMZA-2120: Enable custom handling of 
ConsumerRecords consumed by Kafka 
URL: https://github.com/apache/samza/pull/941#issuecomment-470745081
 
 
   > @cameronlee314 I'd prefer to not add new fields/semantics to IME if 
possible. I'm assuming you want to return the safe offset from LiKafka as the 
checkpoint offset? Any other alternatives to handle it?
   
   That's correct.
   I have sent over some alternative solutions separately.




[GitHub] [samza] cameronlee314 commented on issue #940: SAMZA-2121: Add checkpoint offset field to IncomingMessageEnvelope

2019-03-07 Thread GitBox
cameronlee314 commented on issue #940: SAMZA-2121: Add checkpoint offset field 
to IncomingMessageEnvelope
URL: https://github.com/apache/samza/pull/940#issuecomment-470744806
 
 
   > Looks good.
   > General question around checkpoint offset vs offset:
   > Are there other systems that will benefit with this separation? To me it 
seems this separation arises only in the case of Kafka specifically large 
message scenario.
   
   We don't currently have any other concrete use cases for this, but I was 
thinking that this was still a general enough abstraction that it could apply 
to other systems if they needed to modify offsets.
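
For readers following the offset vs. checkpoint offset question, here is a minimal sketch of the separation being discussed. `EnvelopeSketch` is a hypothetical stand-in and not Samza's `IncomingMessageEnvelope`; the fallback to the regular offset when no separate checkpoint offset is supplied is an assumption made for illustration.

```java
import java.util.Objects;

// Hypothetical stand-in for a message envelope; NOT Samza's actual class.
final class EnvelopeSketch {
  private final String offset;           // position of this message in the stream
  private final String checkpointOffset; // position considered safe to checkpoint; may be null

  EnvelopeSketch(String offset, String checkpointOffset) {
    this.offset = Objects.requireNonNull(offset, "offset");
    this.checkpointOffset = checkpointOffset;
  }

  String getOffset() {
    return offset;
  }

  // Assumed behavior: fall back to the regular offset when the system
  // did not supply a separate checkpoint offset.
  String getCheckpointOffset() {
    return checkpointOffset != null ? checkpointOffset : offset;
  }

  public static void main(String[] args) {
    EnvelopeSketch plain = new EnvelopeSketch("42", null);
    EnvelopeSketch withSafeOffset = new EnvelopeSketch("42", "40");
    System.out.println(plain.getCheckpointOffset());
    System.out.println(withSafeOffset.getCheckpointOffset());
  }
}
```

With this shape, systems that never modify offsets behave exactly as before, while a system such as the large-message Kafka case can report a different position to checkpoint.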




[GitHub] [samza] mynameborat commented on a change in pull request #940: SAMZA-2121: Add checkpoint offset field to IncomingMessageEnvelope

2019-03-07 Thread GitBox
mynameborat commented on a change in pull request #940: SAMZA-2121: Add 
checkpoint offset field to IncomingMessageEnvelope
URL: https://github.com/apache/samza/pull/940#discussion_r263599960
 
 

 ##
 File path: samza-core/src/test/java/org/apache/samza/task/TestAsyncRunLoop.java
 ##
 @@ -51,7 +51,7 @@
 import scala.Option;
 import scala.collection.JavaConverters;
 
-import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.*;
 
 Review comment:
   nit: use explicit imports instead of `*`




Re: "send to" ordering is inconsistent

2019-03-07 Thread Prateek Maheshwari
Hi Tom,

It looks like we won't be able to include SAMZA-2116 in the upcoming 1.1
release due to time constraints. It'll have to go in to the 1.2 release,
which will tentatively be in June. Does that still work for you?

Thanks,
Prateek

On Thu, Feb 28, 2019 at 2:16 PM Tom Davis  wrote:

> Thanks, Prateek! Yes, the workaround will be fine for the time being.
> Thank you again!
>
> Prateek Maheshwari  writes:
>
> > Hi Tom,
> >
> > Thanks for reporting this. I created a ticket (SAMZA-2116) to make the
> > required API changes. We'll include this in the next Samza release, which
> > should be mid to late next month.
> >
> > In the meantime, the workaround would be to keep all of this functionality
> > in a sink function. Does this work for you?
> >
> > Thanks,
> > Prateek
> >
> > On Wed, Feb 27, 2019 at 2:54 PM Tom Davis 
> wrote:
> >
> >>
> >> Prateek Maheshwari  writes:
> >>
> >> > Hi Tom,
> >> >
> >> > I'm assuming that the two sub-DAGs you're talking about are the two
> >> > Map -> Send To chains acting on the "audit-report-requests" input and
> >> > sending their results to the "audit-report-status" output.
> >> >
> >>
> >> Yes, that's correct.
> >>
> >> >
> >> > Although processing within each Task is in-order, the framework does not
> >> > guarantee the order in which the multiple chained operators for an
> >> > operator are evaluated. Specifically, in the current implementation
> >> > <https://github.com/apache/samza/blob/master/samza-core/src/main/java/org/apache/samza/operators/impl/OperatorImpl.java#L106>,
> >> > an Operator's registeredOperators are maintained as a HashSet of
> >> > OperatorImpls. This would explain the out-of-order appearance of the two
> >> > messages. I'm not sure what's changed in 1.0 that makes this trigger now.
> >> >
> >>
> >> Ah! I thought this was the case but I couldn't find the part of the code
> >> to prove it. This makes far more sense than Kafka routinely not
> >> committing messages in order (though it is still technically a
> >> possibility).
> >>
> >> Upon further investigation, I'm not convinced it's a 1.0 issue; I think
> >> we just started using multiple chained operators more heavily.
> >>
> >> >
> >> > Since both sendTo and sink are terminal operators (void return type), I
> >> > don't think you'll be able to easily get around this. Let me discuss this
> >> > with the team and get back to you with a workaround / fix.
> >> >
> >>
> >> Thanks a lot! <3
> >>
> >> >
> >> > Thanks,
> >> > Prateek
> >> >
> >> >
> >> > On Tue, Feb 26, 2019 at 7:08 PM Tom Davis 
> >> wrote:
> >> >
> >> >> Hey folks!
> >> >>
> >> >> We have noticed some inconsistencies in message ordering when running a
> >> >> StreamApplication that calls two separate `map` functions over an input
> >> >> and sends results to the same output. I have attached my Execution Plan,
> >> >> but the gist is that the first `map` function marks a thing as "pending"
> >> >> by sending a message to a status topic and the second `map` function
> >> >> does some work then sends its own status with "done".
> >> >>
> >> >> We have a test set up to read the resulting status topic with a normal
> >> >> Kafka consumer to ensure that two status messages were produced by Samza
> >> >> and consumed in the proper order (first "pending", then "done", per the
> >> >> order of the MessageStream call chains). This test flaps pretty
> >> >> routinely since upgrading to Samza 1.0; we never noticed this in the
> >> >> past. Sometimes, it times out waiting for any messages, though that's
> >> >> considerably less rare than the ordering issue. My understanding is, for
> >> >> a given Task, by default, all processing should be done serially. Is
> >> >> that no longer true? Is the guarantee *only* for the order in which
> >> >> messages are consumed, not produced?
> >> >>
> >> >> For test simplicity, there's a single Kafka partition for each topic and
> >> >> I attempted to create a configuration file that would eliminate as many
> >> >> coordination and concurrency sources as I knew how:
> >> >>
> >> >>   processor.id=0
> >> >>   job.coordinator.factory=org.apache.samza.standalone.PassthroughJobCoordinatorFactory
> >> >>   job.container.single.thread.mode=true
> >> >>
> >> >> (We use the ZkJobCoordinatorFactory normally but both produce the bug)
> >> >>
> >> >> I realize the KafkaProducer does not *technically* guarantee delivery
> >> >> order except when using transactions, which KafkaSystemProducer doesn't
> >> >> appear to do by default. I have checked the actual message envelope and
> >> >> when the ordering is wrong, the offset order is correct -- so, "done"
> >> >> was recorded by Kafka prior to "pending". This seems to rule out Samza
> >> >> but I'm not entirely confident in that conclusion. Any thoughts?
> >> >>
> >> >> Thanks,
> >> >>
> >> >> Tom
> >> >>
> >>
>
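
The HashSet explanation quoted in this thread can be illustrated with plain Java collections. This is a minimal sketch and not Samza code: `OperatorOrderSketch` and the chain names are hypothetical, and strings stand in for the registered operator implementations. It only shows that a `HashSet` iterates by hash code rather than by registration order, whereas a `LinkedHashSet` keeps insertion order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration; strings stand in for registered operators.
final class OperatorOrderSketch {
  // Registers the names into the given set and returns them in the
  // order the set would hand them back during iteration.
  static List<String> evaluationOrder(Set<String> registered, List<String> names) {
    registered.addAll(names);
    return new ArrayList<>(registered);
  }

  public static void main(String[] args) {
    List<String> chains = Arrays.asList("map-pending-sendTo", "map-done-sendTo");

    // HashSet: iteration order depends on hash codes, so it may or may not
    // match the order in which the chains were registered.
    System.out.println(evaluationOrder(new HashSet<>(), chains));

    // LinkedHashSet: iteration order always follows insertion order.
    System.out.println(evaluationOrder(new LinkedHashSet<>(), chains));
  }
}
```

An insertion-ordered collection is one way such an ordering could be made deterministic; whether that matches the eventual SAMZA-2116 change is not stated in the thread.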


Re: [DISCUSS] Samza 1.1.0 release

2019-03-07 Thread Daniel Chen
Unfortunately, due to the timeline of the release and the current status of
SAMZA-2116, we will not be able to complete it for this release.
I will try to target that for the next release (tentatively in June).

- Daniel

On Thu, Mar 7, 2019 at 1:43 PM Jacob Maes  wrote:

> I think the schedule sounds good. A release would be great.
>
> On Thu, Mar 7, 2019 at 10:38 AM Prateek Maheshwari 
> wrote:
>
> > Daniel, let's try to include the following change in the release as well.
> > SAMZA-2116: Make sendTo and sink operators non-terminal
> >
> > Other than that, +1 (binding).
> >
> > - Prateek
> >
> >
> >
> > On Thu, Mar 7, 2019 at 9:22 AM Xinyu Liu  wrote:
> >
> > > +1 (binding)
> > >
> > > Thanks,
> > > Xinyu
> > >
> > > On Thu, Mar 7, 2019 at 12:43 AM santhosh venkat <
> > > santhoshvenkat1...@gmail.com> wrote:
> > >
> > > > +1 (non-binding)
> > > >
> > > > Thanks,
> > > >
> > > > On Wed, Mar 6, 2019 at 10:42 PM Yi Pan  wrote:
> > > >
> > > > > +1 (binding)
> > > > >
> > > > > On Wed, Mar 6, 2019 at 10:08 PM Daniel Chen 
> > wrote:
> > > > >
> > > > > > Hello everyone,
> > > > > >
> > > > > > > We have added a couple of major features to master since 1.0.0 that
> > > > > > > warrant a major release.
> > > > > >
> > > > > > Within LinkedIn, some of these features have already been tested
> as
> > > > part
> > > > > of
> > > > > > our test suites. We plan to continue our testing in coming weeks
> to
> > > > > > validate the stability prior to release.
> > > > > >
> > > > > > Here is the highlighted list of features that are part of the new
> > > > release
> > > > > > (in chronological order)
> > > > > > SAMZA-1981
> > > > > > Consolidate table descriptors to samza-api
> > > > > > SAMZA-1985
> > > > > > Implement Startpoints model and StartpointManager
> > > > > > SAMZA-1998
> > > > > > Table API refactoring
> > > > > > SAMZA-2012
> > > > > > Add API for wiring an external context through to application
> > > > processing
> > > > > > code
> > > > > > SAMZA-2041
> > > > > > Add system descriptors for HDFS and Kinesis
> > > > > > SAMZA-2043
> > > > > > Consolidate ReadableTable and ReadWriteTable
> > > > > > SAMZA-2106
> > > > > > Samza App & Job Config Refactor
> > > > > > SAMZA-2081
> > > > > > Samza SQL : Type system for Samza SQL
> > > > > >
> > > > > > You can find a complete list of features here:
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fissues.apache.org%2Fjira%2Fissues%2F%3Fjql%3Dproject%2520%253D%2520SAMZA%2520AND%2520resolution%2520%2520%253D%2520Fixed%2520%2520AND%2520(fixVersion%2520%253E%253D%25201.1%2520)%2520ORDER%2520BY%2520createdDate%2520%2520DESCdata=02%7C01%7Cdchen1%40linkedin.com%7C01251a7438ea4324f3f608d6a2c11a53%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636875347611087937sdata=ZDMaQj5vX6Vlm%2B8vpGhrNygxpI2vvNnYGi1USWe%2FD5A%3Dreserved=0
> > > > > >
> > > > > > Here is my proposal on our release schedule and timelines.
> > > > > >
> > > > > > >    1. Cut a release version 1.1.0 from master
> > > > > > >    2. Target a release vote in the week of March 13th (next week)
> > > > > >
> > > > > > Thoughts?
> > > > > >
> > > > > > Thanks,
> > > > > > Daniel
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: [POSSIBLE PHISHING] Task Partition Commit Failed After Upgrade

2019-03-07 Thread Prateek Maheshwari
Jeremiah, were you able to resolve this issue?

- Prateek

On Wed, Mar 6, 2019 at 10:08 AM Prateek Maheshwari 
wrote:

> Hi Jeremiah,
>
> The configuration you want to look for is:
> 'job.systemstreampartition.grouper.factory'. It should default to:
> 'org.apache.samza.container.grouper.stream.GroupByPartitionFactory'.
> Can you check if you see this value in the configuration logged by
> SamzaContainer during container start? You can grep for: "Using
> configuration".
>
> For context, there are two groupers for a Samza job. One that groups input
> partitions into tasks (this one), and one that groups tasks into containers
> (the one you mentioned above).
>
> Thanks,
> Prateek
>
>
>
> On Wed, Mar 6, 2019 at 8:14 AM Jeremiah Adams 
> wrote:
>
>> It appears that the issue is related to the KafkaCheckpointLogKey.java
>> constructor. grouperFactoryClassName here is null.  The documentation
>> indicates that task.name.grouper.factory config setting has a default value
>> of
>>  org.apache.samza.container.grouper.task.GroupByContainerCountFactory. I
>> wouldn't expect it to be null here.
>>
>> If I specify GroupByContainerCountFactory for the
>> task.name.grouper.factory in my properties file, I get a
>> NoSuchMethodException:
>>
>> Exception in thread "main" java.lang.InstantiationException: org.apache.samza.container.grouper.task.GroupByContainerCount
>> at java.lang.Class.newInstance(Class.java:427)
>> at org.apache.samza.util.Util$.getObj(Util.scala:80)
>> at org.apache.samza.coordinator.JobModelManager$.readJobModel(JobModelManager.scala:261)
>> at org.apache.samza.coordinator.JobModelManager$.getJobModelManager(JobModelManager.scala:155)
>> at org.apache.samza.coordinator.JobModelManager$.apply(JobModelManager.scala:117)
>> at org.apache.samza.coordinator.JobModelManager.apply(JobModelManager.scala)
>> at org.apache.samza.clustermanager.ClusterBasedJobCoordinator.buildJobModelManager(ClusterBasedJobCoordinator.java:241)
>> at org.apache.samza.clustermanager.ClusterBasedJobCoordinator.<init>(ClusterBasedJobCoordinator.java:152)
>> at org.apache.samza.clustermanager.ClusterBasedJobCoordinator.main(ClusterBasedJobCoordinator.java:297)
>> Caused by: java.lang.NoSuchMethodException: org.apache.samza.container.grouper.task.GroupByContainerCount.<init>()
>> at java.lang.Class.getConstructor0(Class.java:3082)
>> at java.lang.Class.newInstance(Class.java:412)
>> ... 8 more
>>
>>
>>
>> Jeremiah Adams
>> Software Engineer
>> www.helixeducation.com 
>> Blog  | Twitter <
>> https://twitter.com/HelixEducation> | Facebook <
>> https://www.facebook.com/HelixEducation> | LinkedIn <
>> http://www.linkedin.com/company/3609946>
>>
>>
>> On 3/4/19, 2:48 PM, "Jeremiah Adams"  wrote:
>>
>> I am updating dependencies and moving from Samza V0.13.0 to V0.14.0.
>> I develop locally using the grid app in the hello-samza project to spin up
>> local yarn/zookeeper/kafka instances.
>>
>> Grid is running these versions:
>> kafka_2.11-0.10.2.1.tgz
>> hadoop-2.6.1.tar.gz
>> zookeeper-3.4.3.tar.gz
>>
>>
>> My job is now failing with the NPE below. Does anyone have ideas on the
>> cause of this error?
>>
>>
>> 2019-03-04 14:13:49 AsyncRunLoop [ERROR] Task Partition 0 commit failed
>> java.lang.NullPointerException
>> at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:782)
>> at org.apache.samza.checkpoint.kafka.KafkaCheckpointLogKey.<init>(KafkaCheckpointLogKey.java:46)
>> at org.apache.samza.checkpoint.kafka.KafkaCheckpointManager.writeCheckpoint(KafkaCheckpointManager.scala:136)
>> at org.apache.samza.checkpoint.OffsetManager.writeCheckpoint(OffsetManager.scala:259)
>> at org.apache.samza.container.TaskInstance.commit(TaskInstance.scala:205)
>> at org.apache.samza.task.AsyncRunLoop$AsyncTaskWorker$5.run(AsyncRunLoop.java:494)
>> at org.apache.samza.task.AsyncRunLoop$AsyncTaskWorker.commit(AsyncRunLoop.java:513)
>> at org.apache.samza.task.AsyncRunLoop$AsyncTaskWorker.run(AsyncRunLoop.java:379)
>> at org.apache.samza.task.AsyncRunLoop$AsyncTaskWorker.access$300(AsyncRunLoop.java:314)
>> at org.apache.samza.task.AsyncRunLoop.runTasks(AsyncRunLoop.java:228)
>> at org.apache.samza.task.AsyncRunLoop.run(AsyncRunLoop.java:157)
>> at org.apache.samza.container.SamzaContainer.run(SamzaContainer.scala:728)
>> at org.apache.samza.runtime.LocalContainerRunner.run(LocalContainerRunner.java:102)
>> at org.apache.samza.runtime.LocalContainerRunner.main(LocalContainerRunner.java:147)
>> 2019-03-04 14:13:49 AsyncRunLoop [ERROR] Caught throwable and stopping run loop
>>
>>
>>
>> Jeremiah Adams
>> Software 
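
A note on the InstantiationException trace quoted in this thread: it names `GroupByContainerCount` rather than `GroupByContainerCountFactory`, and its root cause is a missing no-arg constructor. One possible reading is that the configured class name pointed at the grouper instead of its factory, though the thread does not confirm this. The sketch below uses only the JDK reflection API; `ReflectiveFactorySketch`, `GrouperFactoryLike` and `GrouperLike` are hypothetical classes, not Samza's.

```java
// Hypothetical illustration of why reflective loading needs a no-arg constructor.
final class ReflectiveFactorySketch {
  // Has a no-arg constructor, so it can be created reflectively.
  static final class GrouperFactoryLike {
    GrouperFactoryLike() { }
  }

  // Only has a constructor that takes an argument, like a class
  // that is meant to be built by a factory.
  static final class GrouperLike {
    final int containerCount;

    GrouperLike(int containerCount) {
      this.containerCount = containerCount;
    }
  }

  // Returns true if the class can be created through a no-arg constructor.
  static boolean canInstantiate(Class<?> clazz) {
    try {
      clazz.getDeclaredConstructor().newInstance();
      return true;
    } catch (NoSuchMethodException e) {
      // No no-arg constructor exists on the class.
      return false;
    } catch (ReflectiveOperationException e) {
      // Constructor exists but could not be invoked.
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(canInstantiate(GrouperFactoryLike.class));
    System.out.println(canInstantiate(GrouperLike.class));
  }
}
```

If the reading above is right, checking that the configured value ends in `Factory` would be a cheap first step before digging further.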

[GitHub] [samza] prateekm commented on issue #941: SAMZA-2120: Enable custom handling of ConsumerRecords consumed by Kafka

2019-03-07 Thread GitBox
prateekm commented on issue #941: SAMZA-2120: Enable custom handling of 
ConsumerRecords consumed by Kafka 
URL: https://github.com/apache/samza/pull/941#issuecomment-470707705
 
 
   @cameronlee314 I'd prefer to not add new fields/semantics to IME if 
possible. I'm assuming you want to return the safe offset from LiKafka as the 
checkpoint offset? Any other alternatives to handle it?




Re: [DISCUSS] Samza 1.1.0 release

2019-03-07 Thread Jacob Maes
I think the schedule sounds good. A release would be great.

On Thu, Mar 7, 2019 at 10:38 AM Prateek Maheshwari 
wrote:

> Daniel, let's try to include the following change in the release as well.
> SAMZA-2116: Make sendTo and sink operators non-terminal
>
> Other than that, +1 (binding).
>
> - Prateek
>
>
>
> On Thu, Mar 7, 2019 at 9:22 AM Xinyu Liu  wrote:
>
> > +1 (binding)
> >
> > Thanks,
> > Xinyu
> >
> > On Thu, Mar 7, 2019 at 12:43 AM santhosh venkat <
> > santhoshvenkat1...@gmail.com> wrote:
> >
> > > +1 (non-binding)
> > >
> > > Thanks,
> > >
> > > On Wed, Mar 6, 2019 at 10:42 PM Yi Pan  wrote:
> > >
> > > > +1 (binding)
> > > >
> > > > On Wed, Mar 6, 2019 at 10:08 PM Daniel Chen 
> wrote:
> > > >
> > > > > Hello everyone,
> > > > >
> > > > > > We have added a couple of major features to master since 1.0.0 that
> > > > > > warrant a major release.
> > > > >
> > > > > Within LinkedIn, some of these features have already been tested as
> > > part
> > > > of
> > > > > our test suites. We plan to continue our testing in coming weeks to
> > > > > validate the stability prior to release.
> > > > >
> > > > > Here is the highlighted list of features that are part of the new
> > > release
> > > > > (in chronological order)
> > > > > SAMZA-1981
> > > > > Consolidate table descriptors to samza-api
> > > > > SAMZA-1985
> > > > > Implement Startpoints model and StartpointManager
> > > > > SAMZA-1998
> > > > > Table API refactoring
> > > > > SAMZA-2012
> > > > > Add API for wiring an external context through to application
> > > processing
> > > > > code
> > > > > SAMZA-2041
> > > > > Add system descriptors for HDFS and Kinesis
> > > > > SAMZA-2043
> > > > > Consolidate ReadableTable and ReadWriteTable
> > > > > SAMZA-2106
> > > > > Samza App & Job Config Refactor
> > > > > SAMZA-2081
> > > > > Samza SQL : Type system for Samza SQL
> > > > >
> > > > > You can find a complete list of features here:
> > > > >
> > > > >
> > > >
> > >
> >
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fissues.apache.org%2Fjira%2Fissues%2F%3Fjql%3Dproject%2520%253D%2520SAMZA%2520AND%2520resolution%2520%2520%253D%2520Fixed%2520%2520AND%2520(fixVersion%2520%253E%253D%25201.1%2520)%2520ORDER%2520BY%2520createdDate%2520%2520DESCdata=02%7C01%7Cdchen1%40linkedin.com%7C01251a7438ea4324f3f608d6a2c11a53%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636875347611087937sdata=ZDMaQj5vX6Vlm%2B8vpGhrNygxpI2vvNnYGi1USWe%2FD5A%3Dreserved=0
> > > > >
> > > > > Here is my proposal on our release schedule and timelines.
> > > > >
> > > > > >    1. Cut a release version 1.1.0 from master
> > > > > >    2. Target a release vote in the week of March 13th (next week)
> > > > >
> > > > > Thoughts?
> > > > >
> > > > > Thanks,
> > > > > Daniel
> > > > >
> > > >
> > >
> >
>


[GitHub] [samza] cameronlee314 commented on a change in pull request #941: SAMZA-2120: Enable custom handling of ConsumerRecords consumed by Kafka

2019-03-07 Thread GitBox
cameronlee314 commented on a change in pull request #941: SAMZA-2120: Enable 
custom handling of ConsumerRecords consumed by Kafka 
URL: https://github.com/apache/samza/pull/941#discussion_r263575676
 
 

 ##
 File path: 
samza-core/src/main/java/org/apache/samza/storage/TaskSideInputStorageManager.java
 ##
 @@ -74,7 +74,7 @@
   private final SystemAdmins systemAdmins;
   private final TaskName taskName;
   private final TaskMode taskMode;
-  private final Map lastProcessedOffsets = new 
ConcurrentHashMap<>();
 
 Review comment:
   I was trying to clarify that this field no longer stored the actual "last 
processed offsets", since we now need to store the "checkpoint offset" for 
writing to the offset files.
   Do you think we should also track the actual "last processed offsets" 
explicitly for the metric? Or would the "checkpoint offset" be sufficient for 
the metric?




[GitHub] [samza] rmatharu commented on a change in pull request #941: SAMZA-2120: Enable custom handling of ConsumerRecords consumed by Kafka

2019-03-07 Thread GitBox
rmatharu commented on a change in pull request #941: SAMZA-2120: Enable custom 
handling of ConsumerRecords consumed by Kafka 
URL: https://github.com/apache/samza/pull/941#discussion_r263554894
 
 

 ##
 File path: samza-core/src/main/java/org/apache/samza/task/AsyncRunLoop.java
 ##
 @@ -611,10 +611,12 @@ public void run() {
 List callbacksToUpdate = 
callbackManager.updateCallback(callbackImpl);
 for (TaskCallbackImpl callbackToUpdate : callbacksToUpdate) {
   IncomingMessageEnvelope envelope = callbackToUpdate.envelope;
-  log.trace("Update offset for ssp {}, offset {}", 
envelope.getSystemStreamPartition(), envelope.getOffset());
+  log.trace("Update offset for ssp {}, checkpoint offset {}", 
envelope.getSystemStreamPartition(),
 
 Review comment:
   log.trace("Update offset for ssp {}, offset {}, checkpoint offset {}", 
envelope.getSystemStreamPartition(), envelope.getOffset(), 
envelope.getCheckpointOffset());




[GitHub] [samza] rmatharu commented on a change in pull request #941: SAMZA-2120: Enable custom handling of ConsumerRecords consumed by Kafka

2019-03-07 Thread GitBox
rmatharu commented on a change in pull request #941: SAMZA-2120: Enable custom 
handling of ConsumerRecords consumed by Kafka 
URL: https://github.com/apache/samza/pull/941#discussion_r263553107
 
 

 ##
 File path: 
samza-core/src/main/java/org/apache/samza/storage/TaskSideInputStorageManager.java
 ##
 @@ -74,7 +74,7 @@
   private final SystemAdmins systemAdmins;
   private final TaskName taskName;
   private final TaskMode taskMode;
-  private final Map lastProcessedOffsets = new 
ConcurrentHashMap<>();
 
 Review comment:
   Perhaps undo this change. 
   Since this map actually stores the "last processed offset for the given side 
input" (which is used to emit a metric). 
   SideInputs are only checkpointed locally (see writeOffsetFiles).




[GitHub] [samza] vjagadish1989 commented on issue #905: SAMZA-2055: [WIP] Async high level api

2019-03-07 Thread GitBox
vjagadish1989 commented on issue #905: SAMZA-2055: [WIP] Async high level api
URL: https://github.com/apache/samza/pull/905#issuecomment-470672963
 
 
   1. Can you add an e2e integration test for a `StreamApp` + async-api
   2. Additionally, update our website-docs and add public code examples for 
usage
   




[GitHub] [samza] rmatharu commented on a change in pull request #942: Bugfix: Recent CSM refactor was causing some metrics to not be emitted. Fixed -restore-time metric.

2019-03-07 Thread GitBox
rmatharu commented on a change in pull request #942: Bugfix: Recent CSM 
refactor was causing some metrics to not be emitted. Fixed -restore-time 
metric.
URL: https://github.com/apache/samza/pull/942#discussion_r263535250
 
 

 ##
 File path: 
samza-core/src/main/scala/org/apache/samza/storage/ContainerStorageManager.java
 ##
 @@ -573,6 +576,25 @@ private StorageEngine createStore(String storeName, 
TaskName taskName, TaskModel
 return 
this.sideInputStorageManagers.values().stream().collect(Collectors.toSet());
   }
 
+  /**
+   * Registers any CSM created metrics such as side-inputs related metrics, 
and standby-task related metrics.
+   * @param metricsReporters metrics reporters to use
+   */
+  public void registerMetrics(Map metricsReporters) {
 
 Review comment:
   That's true, but that pattern doesn't work in this case because both the CSM 
and SamzaContainer need to have a different registry because they both create 
SystemConsumersMetrics. Using the same registry causes a conflict in the 
metrics' name. 
   We do use this pattern in TaskInstance, where each TaskInstance is given a 
set of reporters and it registers its metrics there.
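
The naming conflict described in this comment can be sketched with a toy registry. This is an illustration under stated assumptions, not Samza's metrics API: `SimpleRegistry` is hypothetical, the metric name is made up, and rejecting a duplicate name with an exception is an assumed behavior chosen to make the conflict visible.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; not Samza's MetricsRegistry.
final class RegistrySketch {
  // Toy registry that rejects a metric name that is already registered (assumed behavior).
  static final class SimpleRegistry {
    private final Map<String, Long> counters = new HashMap<>();

    void register(String name) {
      if (counters.putIfAbsent(name, 0L) != null) {
        throw new IllegalStateException("metric already registered: " + name);
      }
    }
  }

  // Two components each register the same metric name.
  // Returns false if the second registration conflicts with the first.
  static boolean registerBoth(SimpleRegistry forContainer, SimpleRegistry forStorageManager) {
    try {
      forContainer.register("system-consumers.poll-count");      // made-up metric name
      forStorageManager.register("system-consumers.poll-count");
      return true;
    } catch (IllegalStateException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    SimpleRegistry shared = new SimpleRegistry();
    // Same registry for both components: the second registration conflicts.
    System.out.println(registerBoth(shared, shared));
    // Separate registries: both registrations succeed.
    System.out.println(registerBoth(new SimpleRegistry(), new SimpleRegistry()));
  }
}
```

Giving each component its own registry sidesteps the clash without renaming either set of metrics.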




[GitHub] [samza] cameronlee314 commented on issue #940: SAMZA-2121: Add checkpoint offset field to IncomingMessageEnvelope

2019-03-07 Thread GitBox
cameronlee314 commented on issue #940: SAMZA-2121: Add checkpoint offset field 
to IncomingMessageEnvelope
URL: https://github.com/apache/samza/pull/940#issuecomment-470661443
 
 
   @rmatharu @mynameborat could you please take a look?




[GitHub] [samza] cameronlee314 commented on issue #941: SAMZA-2120: Enable custom handling of ConsumerRecords consumed by Kafka

2019-03-07 Thread GitBox
cameronlee314 commented on issue #941: SAMZA-2120: Enable custom handling of 
ConsumerRecords consumed by Kafka 
URL: https://github.com/apache/samza/pull/941#issuecomment-470661609
 
 
   @rmatharu @mynameborat could you please take a look?




Re: [DISCUSS] Samza 1.1.0 release

2019-03-07 Thread Wei Song
+1

On 3/7/19, 9:23 AM, "Xinyu Liu"  wrote:

+1 (binding)

Thanks,
Xinyu

On Thu, Mar 7, 2019 at 12:43 AM santhosh venkat <
santhoshvenkat1...@gmail.com> wrote:

> +1 (non-binding)
>
> Thanks,
>
> On Wed, Mar 6, 2019 at 10:42 PM Yi Pan  wrote:
>
> > +1 (binding)
> >
> > On Wed, Mar 6, 2019 at 10:08 PM Daniel Chen  wrote:
> >
> > > Hello everyone,
> > >
> > > > We have added a couple of major features to master since 1.0.0 that
> > > > warrant a major release.
> > >
> > > Within LinkedIn, some of these features have already been tested as
> part
> > of
> > > our test suites. We plan to continue our testing in coming weeks to
> > > validate the stability prior to release.
> > >
> > > Here is the highlighted list of features that are part of the new
> release
> > > (in chronological order)
> > > SAMZA-1981
> > > Consolidate table descriptors to samza-api
> > > SAMZA-1985
> > > Implement Startpoints model and StartpointManager
> > > SAMZA-1998
> > > Table API refactoring
> > > SAMZA-2012
> > > Add API for wiring an external context through to application
> processing
> > > code
> > > SAMZA-2041
> > > Add system descriptors for HDFS and Kinesis
> > > SAMZA-2043
> > > Consolidate ReadableTable and ReadWriteTable
> > > SAMZA-2106
> > > Samza App & Job Config Refactor
> > > SAMZA-2081
> > > Samza SQL : Type system for Samza SQL
> > >
> > > You can find a complete list of features here:
> > >
> > >
> >
> 
https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fissues.apache.org%2Fjira%2Fissues%2F%3Fjql%3Dproject%2520%253D%2520SAMZA%2520AND%2520resolution%2520%2520%253D%2520Fixed%2520%2520AND%2520(fixVersion%2520%253E%253D%25201.1%2520)%2520ORDER%2520BY%2520createdDate%2520%2520DESCdata=02%7C01%7Cwsong%40linkedin.com%7C3da5e940081f49ea99b508d6a3218b86%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636875761806389559sdata=IncsGb89ZEwOFMgonbMvJfyWY1szs%2FlGH1wO2gISquk%3Dreserved=0
> > >
> > > Here is my proposal on our release schedule and timelines.
> > >
> > > >    1. Cut a release version 1.1.0 from master
> > > >    2. Target a release vote in the week of March 13th (next week)
> > >
> > > Thoughts?
> > >
> > > Thanks,
> > > Daniel
> > >
> >
>




Re: [DISCUSS] Samza 1.1.0 release

2019-03-07 Thread Prateek Maheshwari
Daniel, let's try to include the following change in the release as well.
SAMZA-2116: Make sendTo and sink operators non-terminal

Other than that, +1 (binding).

- Prateek



On Thu, Mar 7, 2019 at 9:22 AM Xinyu Liu  wrote:

> +1 (binding)
>
> Thanks,
> Xinyu
>
> On Thu, Mar 7, 2019 at 12:43 AM santhosh venkat <
> santhoshvenkat1...@gmail.com> wrote:
>
> > +1 (non-binding)
> >
> > Thanks,
> >
> > On Wed, Mar 6, 2019 at 10:42 PM Yi Pan  wrote:
> >
> > > +1 (binding)
> > >
> > > On Wed, Mar 6, 2019 at 10:08 PM Daniel Chen  wrote:
> > >
> > > > Hello everyone,
> > > >
> > > > > We have added a couple of major features to master since 1.0.0 that
> > > > > warrant a major release.
> > > >
> > > > Within LinkedIn, some of these features have already been tested as
> > part
> > > of
> > > > our test suites. We plan to continue our testing in coming weeks to
> > > > validate the stability prior to release.
> > > >
> > > > Here is the highlighted list of features that are part of the new
> > release
> > > > (in chronological order)
> > > > SAMZA-1981
> > > > Consolidate table descriptors to samza-api
> > > > SAMZA-1985
> > > > Implement Startpoints model and StartpointManager
> > > > SAMZA-1998
> > > > Table API refactoring
> > > > SAMZA-2012
> > > > Add API for wiring an external context through to application
> > processing
> > > > code
> > > > SAMZA-2041
> > > > Add system descriptors for HDFS and Kinesis
> > > > SAMZA-2043
> > > > Consolidate ReadableTable and ReadWriteTable
> > > > SAMZA-2106
> > > > Samza App & Job Config Refactor
> > > > SAMZA-2081
> > > > Samza SQL : Type system for Samza SQL
> > > >
> > > > You can find a complete list of features here:
> > > >
> > > >
> > >
> >
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fissues.apache.org%2Fjira%2Fissues%2F%3Fjql%3Dproject%2520%253D%2520SAMZA%2520AND%2520resolution%2520%2520%253D%2520Fixed%2520%2520AND%2520(fixVersion%2520%253E%253D%25201.1%2520)%2520ORDER%2520BY%2520createdDate%2520%2520DESCdata=02%7C01%7Cdchen1%40linkedin.com%7C01251a7438ea4324f3f608d6a2c11a53%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636875347611087937sdata=ZDMaQj5vX6Vlm%2B8vpGhrNygxpI2vvNnYGi1USWe%2FD5A%3Dreserved=0
> > > >
> > > > Here is my proposal on our release schedule and timelines.
> > > >
> > > >1. Cut a release version 1.1.0 from master
> > > >2. Target a release vote on the week March 13th (next week)
> > > >
> > > > Thoughts?
> > > >
> > > > Thanks,
> > > > Daniel
> > > >
> > >
> >
>


[GitHub] [samza] prateekm commented on a change in pull request #942: Bugfix: Recent CSM refactor was causing some metrics to not be emitted. Fixed -restore-time metric.

2019-03-07 Thread GitBox
prateekm commented on a change in pull request #942: Bugfix: Recent CSM 
refactor was causing some metrics to not be emitted. Fixed -restore-time 
metric.
URL: https://github.com/apache/samza/pull/942#discussion_r263506752
 
 

 ##
 File path: samza-core/src/main/scala/org/apache/samza/storage/ContainerStorageManager.java
 ##
 @@ -573,6 +576,25 @@ private StorageEngine createStore(String storeName, TaskName taskName, TaskModel
      return this.sideInputStorageManagers.values().stream().collect(Collectors.toSet());
    }
 
+  /**
+   * Registers any CSM created metrics such as side-inputs related metrics, and standby-task related metrics.
+   * @param metricsReporters metrics reporters to use
+   */
+  public void registerMetrics(Map metricsReporters) {
 
 Review comment:
   I don't think this is the standard pattern. The right way would be to pass 
in the shared MetricsRegistry (created in SamzaContainer) to the CSM, and 
register metrics on that in CSM#register(). That MetricsRegistry is already 
registered with the MetricsReporters in SamzaContainer. Does this make sense?
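
The pattern described in the comment above can be sketched roughly as follows. This is only an illustration under stated assumptions: `SimpleRegistry` and `StorageManagerSketch` are hypothetical, simplified stand-ins written for this example, not the actual Samza `MetricsRegistry` or `ContainerStorageManager` classes, and the sketch registers its gauge in the constructor for brevity rather than in a separate register step.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

public class SharedRegistrySketch {

  /** Hypothetical, simplified stand-in for a metrics registry (not the Samza interface). */
  static final class SimpleRegistry {
    private final Map<String, AtomicLong> gauges = new ConcurrentHashMap<>();

    /** Returns the gauge for group/name, creating it on first use. */
    AtomicLong newGauge(String group, String name) {
      return gauges.computeIfAbsent(group + "." + name, key -> new AtomicLong());
    }

    /** Live view of all gauges; a reporter attached to the registry would read this. */
    Map<String, AtomicLong> snapshot() {
      return gauges;
    }
  }

  /** Hypothetical storage manager that is handed the shared registry, not the reporters. */
  static final class StorageManagerSketch {
    private final AtomicLong restoreTimeMs;

    StorageManagerSketch(SimpleRegistry sharedRegistry, String storeName) {
      // The metric is created on the registry owned by the container, so anything
      // already reading that registry sees it without further wiring.
      this.restoreTimeMs = sharedRegistry.newGauge("storage", storeName + "-restore-time");
    }

    void recordRestore(long millis) {
      restoreTimeMs.set(millis);
    }
  }

  public static void main(String[] args) {
    SimpleRegistry registry = new SimpleRegistry(); // created once by the container
    StorageManagerSketch manager = new StorageManagerSketch(registry, "my-store");
    manager.recordRestore(1250L);
    System.out.println(registry.snapshot());
  }
}
```

The point of the design is that the component never needs to know which reporters exist: it only writes to the registry it was given, and the owner of the registry decides who reads it.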




[GitHub] [samza] prateekm commented on a change in pull request #942: Bugfix: Recent CSM refactor was causing some metrics to not be emitted. Fixed -restore-time metric.

2019-03-07 Thread GitBox
prateekm commented on a change in pull request #942: Bugfix: Recent CSM 
refactor was causing some metrics to not be emitted. Fixed -restore-time 
metric.
URL: https://github.com/apache/samza/pull/942#discussion_r263505840
 
 

 ##
 File path: samza-core/src/main/scala/org/apache/samza/system/SystemConsumersMetrics.scala
 ##
 @@ -19,12 +19,10 @@
 
 package org.apache.samza.system
 
-import org.apache.samza.metrics.MetricsRegistry
-import org.apache.samza.metrics.MetricsRegistryMap
-import org.apache.samza.metrics.Counter
-import org.apache.samza.metrics.MetricsHelper
+import org.apache.samza.metrics._
 
 Review comment:
   Minor: Prefer explicit imports.




Re: [DISCUSS] Samza 1.1.0 release

2019-03-07 Thread Xinyu Liu
+1 (binding)

Thanks,
Xinyu

On Thu, Mar 7, 2019 at 12:43 AM santhosh venkat <
santhoshvenkat1...@gmail.com> wrote:

> +1 (non-binding)
>
> Thanks,
>
> On Wed, Mar 6, 2019 at 10:42 PM Yi Pan  wrote:
>
> > +1 (binding)
> >
> > On Wed, Mar 6, 2019 at 10:08 PM Daniel Chen  wrote:
> >
> > > Hello everyone,
> > >
> > > We have added a couple of major features to master since 1.0.0 that
> > > warrant a major release.
> > >
> > > Within LinkedIn, some of these features have already been tested as part
> > > of our test suites. We plan to continue our testing in the coming weeks to
> > > validate the stability prior to release.
> > >
> > > Here is the highlighted list of features that are part of the new release
> > > (in chronological order):
> > > SAMZA-1981: Consolidate table descriptors to samza-api
> > > SAMZA-1985: Implement Startpoints model and StartpointManager
> > > SAMZA-1998: Table API refactoring
> > > SAMZA-2012: Add API for wiring an external context through to application
> > > processing code
> > > SAMZA-2041: Add system descriptors for HDFS and Kinesis
> > > SAMZA-2043: Consolidate ReadableTable and ReadWriteTable
> > > SAMZA-2106: Samza App & Job Config Refactor
> > > SAMZA-2081: Samza SQL: Type system for Samza SQL
> > >
> > > You can find a complete list of features here:
> > >
> > > https://issues.apache.org/jira/issues/?jql=project%20%3D%20SAMZA%20AND%20resolution%20%20%3D%20Fixed%20%20AND%20(fixVersion%20%3E%3D%201.1%20)%20ORDER%20BY%20createdDate%20%20DESC
> > >
> > > Here is my proposal on our release schedule and timelines.
> > >
> > >    1. Cut a release version 1.1.0 from master
> > >    2. Target a release vote in the week of March 13th (next week)
> > >
> > > Thoughts?
> > >
> > > Thanks,
> > > Daniel
> > >
> >
>


Re: Trouble running samza 0.14 in standalone mode

2019-03-07 Thread Anoop Krishnakumar
Hi Yi,

Apologies for my ignorance. I did not realize that attachments won't make it
through and that a gist is the preferred way to share logs and code snippets.

The issue is resolved. I was using the default task.name.grouper.factory
instead of GroupByContainerIdsFactory.
Thanks for the response; I appreciate it. I will consider moving to version 1.0.

-anoop
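
For readers hitting the same problem, the fix described above comes down to one configuration key. The sketch below is a hedged illustration, not the poster's actual configuration: the fully qualified class name is given to the best of my knowledge of the Samza codebase and should be checked against the version in use, and all other keys a standalone job needs are omitted. `StandaloneConfigSketch` and its members are hypothetical names used only for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

public class StandaloneConfigSketch {

  static final String GROUPER_KEY = "task.name.grouper.factory";

  // Assumed excerpt of a standalone job config; the package name of the factory
  // class is an assumption to verify, and other required keys are left out.
  static final String EXAMPLE = String.join("\n",
      "# Override the default task name grouper for standalone mode",
      "task.name.grouper.factory=org.apache.samza.container.grouper.task.GroupByContainerIdsFactory");

  /** Parses properties-format text into a Properties object. */
  static Properties parse(String text) throws IOException {
    Properties props = new Properties();
    props.load(new StringReader(text));
    return props;
  }

  public static void main(String[] args) throws IOException {
    Properties props = parse(EXAMPLE);
    // Print the effective grouper so a misconfiguration is visible at startup.
    System.out.println(GROUPER_KEY + " = " + props.getProperty(GROUPER_KEY));
  }
}
```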


On Thu, 7 Mar 2019 at 01:41, Yi Pan  wrote:

> Hi, Anoop,
>
> 1. Please provide the full log file if possible. Listing a single log line
> that reports the failure does not help.
> 2. Is there any reason you are staying with Samza 0.14? I would highly
> recommend upgrading to 1.0, since tons of API and standalone-related
> improvements went into that version after 0.14.
>
> -Yi
>
> On Wed, Mar 6, 2019 at 1:54 PM Anoop Krishnakumar <
> anoop.krishnaku...@gmail.com> wrote:
>
> > I am prototyping an application where I have two input topics and two
> > output topics. The application also uses a RocksDB state store that is
> > backed by Kafka.
> >
> > I followed the WikipediaZkLocalApplication example, but the application
> > always exits without listening for messages from the input topics. I could
> > see the execution plan and the job coordinator communicating with
> > ZooKeeper. I could see the following line in the logs:
> > StreamProcessor [INFO] Container is not instantiated for stream processor: 5f7e2a46-fb7a-4054-bf5c-c62423ca2e35.
> > I could not figure out why the container wouldn't start. Any help would
> > be appreciated.
> >
> > Attached is the log and the job configuration.
> >
> > -anoop
> >
>


Re: [DISCUSS] Samza 1.1.0 release

2019-03-07 Thread santhosh venkat
+1 (non-binding)

Thanks,

On Wed, Mar 6, 2019 at 10:42 PM Yi Pan  wrote:

> +1 (binding)
>
> On Wed, Mar 6, 2019 at 10:08 PM Daniel Chen  wrote:
>
> > Hello everyone,
> >
> > We have added a couple of major features to master since 1.0.0 that
> > warrant a major release.
> >
> > Within LinkedIn, some of these features have already been tested as part
> > of our test suites. We plan to continue our testing in the coming weeks to
> > validate the stability prior to release.
> >
> > Here is the highlighted list of features that are part of the new release
> > (in chronological order):
> > SAMZA-1981: Consolidate table descriptors to samza-api
> > SAMZA-1985: Implement Startpoints model and StartpointManager
> > SAMZA-1998: Table API refactoring
> > SAMZA-2012: Add API for wiring an external context through to application
> > processing code
> > SAMZA-2041: Add system descriptors for HDFS and Kinesis
> > SAMZA-2043: Consolidate ReadableTable and ReadWriteTable
> > SAMZA-2106: Samza App & Job Config Refactor
> > SAMZA-2081: Samza SQL: Type system for Samza SQL
> >
> > You can find a complete list of features here:
> >
> > https://issues.apache.org/jira/issues/?jql=project%20%3D%20SAMZA%20AND%20resolution%20%20%3D%20Fixed%20%20AND%20(fixVersion%20%3E%3D%201.1%20)%20ORDER%20BY%20createdDate%20%20DESC
> >
> > Here is my proposal on our release schedule and timelines.
> >
> >    1. Cut a release version 1.1.0 from master
> >    2. Target a release vote in the week of March 13th (next week)
> >
> > Thoughts?
> >
> > Thanks,
> > Daniel
> >
>