[jira] [Comment Edited] (STORM-592) Update stats.clj rolling-window-set function, exchange the real argument num-buckets and s of rolling-window function

2014-12-15 Thread zhangjinlong (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246416#comment-14246416
 ] 

zhangjinlong edited comment on STORM-592 at 12/15/14 8:33 AM:
--

Swap the actual arguments num-buckets and s in the rolling-window call.

https://github.com/BuDongDong/storm/commit/69af505fa002550e952129372ed88a298cea7fea


was (Author: zhangjinlong):
Swap the actual arguments num-buckets and s in the rolling-window call.

https://github.com/BuDongDong/storm/commit/785cda7a97877a25dac6fe96648f17ea42309ed7

 Update stats.clj rolling-window-set function, exchange the real argument 
 num-buckets and s of rolling-window function
 -

 Key: STORM-592
 URL: https://issues.apache.org/jira/browse/STORM-592
 Project: Apache Storm
  Issue Type: Bug
Affects Versions: 0.9.3-rc2
Reporter: zhangjinlong
Assignee: zhangjinlong

 (defn rolling-window-set [updater merger extractor num-buckets & bucket-sizes]
   (RollingWindowSet. updater extractor (dofor [s bucket-sizes]
 (rolling-window updater merger extractor s num-buckets)) nil))
 (defrecord RollingWindow [updater merger extractor bucket-size-secs
 num-buckets buckets])
 If the actual arguments "num-buckets" and s of the "rolling-window" call
 are not swapped, then the bucket-size-secs of RollingWindow is 30/540/4320
 and the num-buckets of RollingWindow is 20.
 I think that the bucket-size-secs of RollingWindow should be 20, and the
 num-buckets of RollingWindow should be 30/540/4320.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (STORM-592) Update stats.clj rolling-window-set function, exchange the real argument num-buckets and s of rolling-window function

2014-12-15 Thread zhangjinlong (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246413#comment-14246413
 ] 

zhangjinlong edited comment on STORM-592 at 12/15/14 8:33 AM:
--

(defrecord RollingWindow [updater merger extractor bucket-size-secs num-buckets
buckets])

I think that the bucket-size-secs of RollingWindow should be 20, and the
num-buckets of RollingWindow should be 30/540/4320. If the actual arguments
"num-buckets" and s of the "rolling-window" call are not swapped, then the
bucket-size-secs of RollingWindow is 30/540/4320, and the num-buckets of
RollingWindow is 20.

Creating CommonStats without swapping the actual arguments "num-buckets" and s
of the "rolling-window" call:

(def NUM-STAT-BUCKETS 20)
;; 10 minutes, 3 hours, 1 day
(def STAT-BUCKETS [30 540 4320])

(defn- mk-common-stats
  [rate]
  (CommonStats.
    (atom (apply keyed-counter-rolling-window-set NUM-STAT-BUCKETS STAT-BUCKETS))
    (atom (apply keyed-counter-rolling-window-set NUM-STAT-BUCKETS STAT-BUCKETS))
    rate))

mk-common-stats calls keyed-counter-rolling-window-set:
=> (apply keyed-counter-rolling-window-set 20 [30 540 4320])

(defn keyed-counter-rolling-window-set
  [num-buckets & bucket-sizes]
  (apply rolling-window-set incr-val (partial merge-with +) counter-extract
         num-buckets bucket-sizes))

keyed-counter-rolling-window-set calls rolling-window-set:
=> (apply rolling-window-set incr-val (partial merge-with +) counter-extract
          20 [30 540 4320])

(defn rolling-window-set [updater merger extractor num-buckets & bucket-sizes]
  (RollingWindowSet. updater extractor
                     (dofor [s bucket-sizes]
                       (rolling-window updater merger extractor s num-buckets))
                     nil))

rolling-window-set calls the RollingWindowSet constructor:
=> (RollingWindowSet. updater extractor
     (dofor [s [30 540 4320]] (rolling-window updater merger extractor s 20))
     nil)

The dofor in that constructor call invokes rolling-window once per bucket size:
=> (rolling-window updater merger extractor 30 20)
=> (rolling-window updater merger extractor 540 20)
=> (rolling-window updater merger extractor 4320 20)

(defn rolling-window
  [updater merger extractor bucket-size-secs num-buckets]
  (RollingWindow. updater merger extractor bucket-size-secs num-buckets {}))

rolling-window calls the RollingWindow constructor:
=> (RollingWindow. updater merger extractor 30 20 {})
=> (RollingWindow. updater merger extractor 540 20 {})
=> (RollingWindow. updater merger extractor 4320 20 {})

So without the swap, the bucket-size-secs of RollingWindow is 30/540/4320 and
the num-buckets of RollingWindow is 20.

Creating CommonStats with the actual arguments "num-buckets" and s of the
"rolling-window" call swapped:

=> (RollingWindow. updater merger extractor 20 30 {})
=> (RollingWindow. updater merger extractor 20 540 {})
=> (RollingWindow. updater merger extractor 20 4320 {})

I think bucket-size-secs should represent the size of a bucket, not the count
of buckets, and "num-buckets" should represent the count of buckets, not the
size of a bucket. So it is necessary to swap the actual arguments
"num-buckets" and s in the "rolling-window" call.
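The call chain above can be reproduced in miniature. Below is a small Python sketch (names mirror the Clojure ones; updater/merger/extractor are left out, since only the last two arguments are in question):

```python
from collections import namedtuple

# Stand-in for the RollingWindow defrecord; only the two disputed fields
# (bucket-size-secs and num-buckets) are modeled.
RollingWindow = namedtuple("RollingWindow", ["bucket_size_secs", "num_buckets"])

def rolling_window(bucket_size_secs, num_buckets):
    return RollingWindow(bucket_size_secs, num_buckets)

NUM_STAT_BUCKETS = 20
STAT_BUCKETS = [30, 540, 4320]  # ;; 10 minutes, 3 hours, 1 day

# Existing argument order: (rolling-window ... s num-buckets)
existing = [rolling_window(s, NUM_STAT_BUCKETS) for s in STAT_BUCKETS]

# Order proposed by the patch: (rolling-window ... num-buckets s)
proposed = [rolling_window(NUM_STAT_BUCKETS, s) for s in STAT_BUCKETS]

print(existing[0])  # RollingWindow(bucket_size_secs=30, num_buckets=20)
print(proposed[0])  # RollingWindow(bucket_size_secs=20, num_buckets=30)
```

Either ordering type-checks, which is why the disagreement is about intent: whether 30/540/4320 are meant as bucket sizes in seconds or as bucket counts.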

https://github.com/BuDongDong/storm/commit/69af505fa002550e952129372ed88a298cea7fea





[GitHub] storm pull request: Update stats.clj rolling-window-set function...

2014-12-15 Thread BuDongDong
GitHub user BuDongDong opened a pull request:

https://github.com/apache/storm/pull/348

Update stats.clj rolling-window-set function, exchange the real argument num-buckets and s of rolling-window function

(defn rolling-window-set [updater merger extractor num-buckets & bucket-sizes]
  (RollingWindowSet. updater extractor (dofor [s bucket-sizes]
(rolling-window updater merger extractor s num-buckets)) nil))
(defrecord RollingWindow [updater merger extractor bucket-size-secs
num-buckets buckets])
If the actual arguments "num-buckets" and s of the "rolling-window" call are
not swapped, then the bucket-size-secs of RollingWindow is 30/540/4320 and
the num-buckets of RollingWindow is 20.
I think that the bucket-size-secs of RollingWindow should be 20, and the
num-buckets of RollingWindow should be 30/540/4320.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/BuDongDong/storm-1 master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/storm/pull/348.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #348


commit b827d660bf18f711da36012355c7d06e35b926e3
Author: zhangjinlong zhangjinlong0...@126.com
Date:   2014-12-15T08:50:50Z

Update stats.clj rolling-window-set function, exchange the real argument 
num-buckets and s of rolling-window function

(defn rolling-window-set [updater merger extractor num-buckets & bucket-sizes]
  (RollingWindowSet. updater extractor (dofor [s bucket-sizes]
(rolling-window updater merger extractor s num-buckets)) nil))
(defrecord RollingWindow [updater merger extractor bucket-size-secs
num-buckets buckets])
If the actual arguments "num-buckets" and s of the "rolling-window" call are
not swapped, then the bucket-size-secs of RollingWindow is 30/540/4320 and
the num-buckets of RollingWindow is 20.
I think that the bucket-size-secs of RollingWindow should be 20, and the
num-buckets of RollingWindow should be 30/540/4320.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] storm pull request: Update stats.clj rolling-window-set function...

2014-12-15 Thread BuDongDong
Github user BuDongDong commented on the pull request:

https://github.com/apache/storm/pull/348#issuecomment-66964691
  
(defrecord RollingWindow [updater merger extractor bucket-size-secs num-buckets
buckets])

I think that the bucket-size-secs of RollingWindow should be 20, and the
num-buckets of RollingWindow should be 30/540/4320. If the actual arguments
"num-buckets" and s of the "rolling-window" call are not swapped, then the
bucket-size-secs of RollingWindow is 30/540/4320, and the num-buckets of
RollingWindow is 20.

Creating CommonStats without swapping the actual arguments "num-buckets" and s
of the "rolling-window" call:

(def NUM-STAT-BUCKETS 20)
;; 10 minutes, 3 hours, 1 day
(def STAT-BUCKETS [30 540 4320])

(defn- mk-common-stats
  [rate]
  (CommonStats.
    (atom (apply keyed-counter-rolling-window-set NUM-STAT-BUCKETS STAT-BUCKETS))
    (atom (apply keyed-counter-rolling-window-set NUM-STAT-BUCKETS STAT-BUCKETS))
    rate))

mk-common-stats calls keyed-counter-rolling-window-set:
=> (apply keyed-counter-rolling-window-set 20 [30 540 4320])

(defn keyed-counter-rolling-window-set
  [num-buckets & bucket-sizes]
  (apply rolling-window-set incr-val (partial merge-with +) counter-extract
         num-buckets bucket-sizes))

keyed-counter-rolling-window-set calls rolling-window-set:
=> (apply rolling-window-set incr-val (partial merge-with +) counter-extract
          20 [30 540 4320])

(defn rolling-window-set [updater merger extractor num-buckets & bucket-sizes]
  (RollingWindowSet. updater extractor
                     (dofor [s bucket-sizes]
                       (rolling-window updater merger extractor s num-buckets))
                     nil))

rolling-window-set calls the RollingWindowSet constructor:
=> (RollingWindowSet. updater extractor
     (dofor [s [30 540 4320]] (rolling-window updater merger extractor s 20))
     nil)

The dofor in that constructor call invokes rolling-window once per bucket size:
=> (rolling-window updater merger extractor 30 20)
=> (rolling-window updater merger extractor 540 20)
=> (rolling-window updater merger extractor 4320 20)

(defn rolling-window
  [updater merger extractor bucket-size-secs num-buckets]
  (RollingWindow. updater merger extractor bucket-size-secs num-buckets {}))

rolling-window calls the RollingWindow constructor:
=> (RollingWindow. updater merger extractor 30 20 {})
=> (RollingWindow. updater merger extractor 540 20 {})
=> (RollingWindow. updater merger extractor 4320 20 {})

So without the swap, the bucket-size-secs of RollingWindow is 30/540/4320 and
the num-buckets of RollingWindow is 20.

Creating CommonStats with the actual arguments "num-buckets" and s of the
"rolling-window" call swapped:

=> (RollingWindow. updater merger extractor 20 30 {})
=> (RollingWindow. updater merger extractor 20 540 {})
=> (RollingWindow. updater merger extractor 20 4320 {})

I think bucket-size-secs should represent the size of a bucket, not the count
of buckets, and "num-buckets" should represent the count of buckets, not the
size of a bucket. So it is necessary to swap the actual arguments
"num-buckets" and s in the "rolling-window" call.




[jira] [Comment Edited] (STORM-592) Update stats.clj rolling-window-set function, exchange the real argument num-buckets and s of rolling-window function

2014-12-15 Thread zhangjinlong (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246413#comment-14246413
 ] 

zhangjinlong edited comment on STORM-592 at 12/15/14 9:05 AM:
--

(defrecord RollingWindow [updater merger extractor bucket-size-secs num-buckets
buckets])

I think that the bucket-size-secs of RollingWindow should be 20, and the
num-buckets of RollingWindow should be 30/540/4320. If the actual arguments
"num-buckets" and s of the "rolling-window" call are not swapped, then the
bucket-size-secs of RollingWindow is 30/540/4320, and the num-buckets of
RollingWindow is 20.

Creating CommonStats without swapping the actual arguments "num-buckets" and s
of the "rolling-window" call:

(def NUM-STAT-BUCKETS 20)
;; 10 minutes, 3 hours, 1 day
(def STAT-BUCKETS [30 540 4320])

(defn- mk-common-stats
  [rate]
  (CommonStats.
    (atom (apply keyed-counter-rolling-window-set NUM-STAT-BUCKETS STAT-BUCKETS))
    (atom (apply keyed-counter-rolling-window-set NUM-STAT-BUCKETS STAT-BUCKETS))
    rate))

mk-common-stats calls keyed-counter-rolling-window-set:
=> (apply keyed-counter-rolling-window-set 20 [30 540 4320])

(defn keyed-counter-rolling-window-set
  [num-buckets & bucket-sizes]
  (apply rolling-window-set incr-val (partial merge-with +) counter-extract
         num-buckets bucket-sizes))

keyed-counter-rolling-window-set calls rolling-window-set:
=> (apply rolling-window-set incr-val (partial merge-with +) counter-extract
          20 [30 540 4320])

(defn rolling-window-set [updater merger extractor num-buckets & bucket-sizes]
  (RollingWindowSet. updater extractor
                     (dofor [s bucket-sizes]
                       (rolling-window updater merger extractor s num-buckets))
                     nil))

rolling-window-set calls the RollingWindowSet constructor:
=> (RollingWindowSet. updater extractor
     (dofor [s [30 540 4320]] (rolling-window updater merger extractor s 20))
     nil)

The dofor in that constructor call invokes rolling-window once per bucket size:
=> (rolling-window updater merger extractor 30 20)
=> (rolling-window updater merger extractor 540 20)
=> (rolling-window updater merger extractor 4320 20)

(defn rolling-window
  [updater merger extractor bucket-size-secs num-buckets]
  (RollingWindow. updater merger extractor bucket-size-secs num-buckets {}))

rolling-window calls the RollingWindow constructor:
=> (RollingWindow. updater merger extractor 30 20 {})
=> (RollingWindow. updater merger extractor 540 20 {})
=> (RollingWindow. updater merger extractor 4320 20 {})

So without the swap, the bucket-size-secs of RollingWindow is 30/540/4320 and
the num-buckets of RollingWindow is 20.

Creating CommonStats with the actual arguments "num-buckets" and s of the
"rolling-window" call swapped:

=> (RollingWindow. updater merger extractor 20 30 {})
=> (RollingWindow. updater merger extractor 20 540 {})
=> (RollingWindow. updater merger extractor 20 4320 {})

I think bucket-size-secs should represent the size of a bucket, not the count
of buckets, and "num-buckets" should represent the count of buckets, not the
size of a bucket. So it is necessary to swap the actual arguments
"num-buckets" and s in the "rolling-window" call.

https://github.com/apache/storm/pull/348





[jira] [Updated] (STORM-329) Add Option to Config Message handling strategy when connection timeout

2014-12-15 Thread xiajun (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiajun updated STORM-329:
-
Fix Version/s: (was: 0.9.3-rc2)

 Add Option to Config Message handling strategy when connection timeout
 --

 Key: STORM-329
 URL: https://issues.apache.org/jira/browse/STORM-329
 Project: Apache Storm
  Issue Type: Improvement
Affects Versions: 0.9.2-incubating
Reporter: Sean Zhong
Priority: Minor
  Labels: Netty
 Attachments: storm-329.patch, worker-kill-recover3.jpg


 This is to address a [concern brought 
 up|https://github.com/apache/incubator-storm/pull/103#issuecomment-43632986] 
 during the work at STORM-297:
 {quote}
 [~revans2] wrote: Your logic makes sense to me on why these calls are
 blocking. My biggest concern around the blocking is in the case of a worker
 crashing. If a single worker crashes, this can block the entire topology from
 executing until that worker comes back up. In some cases I can see that being
 something that you would want. In other cases I can see speed being the
 primary concern, and some users would like to get partial data fast rather
 than accurate data later.
 Could we make it configurable on a follow-up JIRA where we can have a max
 limit to the buffering that is allowed before we block, or throw data away
 (which is what zeromq does)?
 {quote}
 If some worker crash suddenly, how to handle the message which was supposed 
 to be delivered to the worker?
 1. Should we buffer all message infinitely?
 2. Should we block the message sending until the connection is resumed?
 3. Should we config a buffer limit, try to buffer the message first, if the 
 limit is met, then block?
 4. Should we neither block, nor buffer too much, but choose to drop the 
 messages, and use the built-in storm failover mechanism? 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] storm pull request: STORM-593: remove endpoint-socket-lock for wor...

2014-12-15 Thread tedxia
GitHub user tedxia opened a pull request:

https://github.com/apache/storm/pull/349

STORM-593: remove endpoint-socket-lock for worker-data

PR for [STORM-593](https://issues.apache.org/jira/browse/STORM-593)

cached-node+port-socket in worker-data is an atom, so there is no need for the
rwlock endpoint-socket-lock to protect it. Moreover, using the rwlock creates
contention between refresh-connections and message sending.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tedxia/incubator-storm 
remove-endpoint-socket-lock

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/storm/pull/349.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #349


commit 5210fce329571da2b6121c8ba941d3c7774face1
Author: xiajun xia...@xiaomi.com
Date:   2014-12-15T09:56:53Z

remove endpoint-socket-lock in worker.clj






[jira] [Commented] (STORM-593) No need of rwlock for clojure atom

2014-12-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246512#comment-14246512
 ] 

ASF GitHub Bot commented on STORM-593:
--

GitHub user tedxia opened a pull request:

https://github.com/apache/storm/pull/349

STORM-593: remove endpoint-socket-lock for worker-data

PR for [STORM-593](https://issues.apache.org/jira/browse/STORM-593)

cached-node+port-socket in worker-data is an atom, so there is no need for the
rwlock endpoint-socket-lock to protect it. Moreover, using the rwlock creates
contention between refresh-connections and message sending.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tedxia/incubator-storm 
remove-endpoint-socket-lock

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/storm/pull/349.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #349


commit 5210fce329571da2b6121c8ba941d3c7774face1
Author: xiajun xia...@xiaomi.com
Date:   2014-12-15T09:56:53Z

remove endpoint-socket-lock in worker.clj




 No need of rwlock for clojure atom 
 ---

 Key: STORM-593
 URL: https://issues.apache.org/jira/browse/STORM-593
 Project: Apache Storm
  Issue Type: Improvement
Affects Versions: 0.10.0
Reporter: xiajun
Assignee: xiajun

 cached-node+port-socket in worker-data is an atom, so there is no need for
 the rwlock endpoint-socket-lock to protect it. Moreover, using the rwlock
 creates contention between refresh-connections and message sending.





[GitHub] storm pull request: STORM-512 KafkaBolt doesn't handle ticks prope...

2014-12-15 Thread nielsbasjes
Github user nielsbasjes commented on the pull request:

https://github.com/apache/storm/pull/275#issuecomment-66971975
  
@nathanmarz Thanks, I understand your view now.
I refactored my patch to meet this requirement and have tried to keep the
change as small as possible.

This set of commits should really be squashed before committing to master.





[jira] [Commented] (STORM-512) KafkaBolt doesn't handle ticks properly

2014-12-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246513#comment-14246513
 ] 

ASF GitHub Bot commented on STORM-512:
--

Github user nielsbasjes commented on the pull request:

https://github.com/apache/storm/pull/275#issuecomment-66971975
  
@nathanmarz Thanks, I understand your view now.
I refactored my patch to meet this requirement and have tried to keep the
change as small as possible.

This set of commits should really be squashed before committing to master.



 KafkaBolt doesn't handle ticks properly
 ---

 Key: STORM-512
 URL: https://issues.apache.org/jira/browse/STORM-512
 Project: Apache Storm
  Issue Type: Bug
Reporter: Niels Basjes

 I found that when using the KafkaBolt, tick tuples are not handled
 properly. They should be ignored, but in reality they are not.
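The fix under discussion amounts to recognizing tick tuples at the top of execute and dropping them. A minimal Python sketch of the idea (dict-based stand-ins for Storm's Tuple API; the "__system"/"__tick" ids correspond to Storm's system component and tick stream ids):

```python
SYSTEM_COMPONENT_ID = "__system"
SYSTEM_TICK_STREAM_ID = "__tick"

def is_tick(tup):
    # A tuple is a tick when it arrives from the system component's
    # tick stream rather than from an upstream spout or bolt.
    return (tup["source_component"] == SYSTEM_COMPONENT_ID
            and tup["source_stream"] == SYSTEM_TICK_STREAM_ID)

class KafkaBoltSketch:
    """Toy bolt: forwards payloads and ignores ticks (a real bolt would also ack them)."""
    def __init__(self):
        self.sent = []

    def execute(self, tup):
        if is_tick(tup):
            return  # ignore the tick instead of writing it to Kafka
        self.sent.append(tup["value"])

bolt = KafkaBoltSketch()
bolt.execute({"source_component": "__system", "source_stream": "__tick", "value": None})
bolt.execute({"source_component": "spout", "source_stream": "default", "value": "msg"})
print(bolt.sent)  # ['msg']
```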





[GitHub] storm pull request: STORM-593: remove endpoint-socket-lock for wor...

2014-12-15 Thread nathanmarz
Github user nathanmarz commented on the pull request:

https://github.com/apache/storm/pull/349#issuecomment-66975846
  
-1. That r/w lock ensures that send is never called on a closed connection. 
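The objection can be sketched outside Storm. Below is a minimal Python model of the check-then-act gap (illustrative names, not Storm code; Python's standard library has no read/write lock, so a plain Lock stands in for endpoint-socket-lock):

```python
import threading

class Connection:
    """Toy stand-in for a worker's client connection."""
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

    def send(self, msg):
        if self.closed:
            raise RuntimeError("send called on a closed connection")

# The analogue of the cached-node+port-socket atom: each read of the map
# is atomic, but nothing stops the connection from being closed *between*
# the lookup and the send.
connections = {"node:port": Connection()}
lock = threading.Lock()

def send_without_lock(key, msg):
    conn = connections.get(key)   # atomic lookup...
    # ...but refresh-connections may close `conn` right here.
    if conn is not None:
        conn.send(msg)

def send_with_lock(key, msg):
    # refresh-connections would take the write side of the same lock,
    # so the connection cannot be closed while we hold it.
    with lock:
        conn = connections.get(key)
        if conn is not None:
            conn.send(msg)
```

The atomic map alone makes the lookup safe; only the lock makes the lookup-plus-send pair safe against a concurrent close.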




[jira] [Commented] (STORM-593) No need of rwlock for clojure atom

2014-12-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246533#comment-14246533
 ] 

ASF GitHub Bot commented on STORM-593:
--

Github user nathanmarz commented on the pull request:

https://github.com/apache/storm/pull/349#issuecomment-66975846
  
-1. That r/w lock ensures that send is never called on a closed connection. 


 No need of rwlock for clojure atom 
 ---

 Key: STORM-593
 URL: https://issues.apache.org/jira/browse/STORM-593
 Project: Apache Storm
  Issue Type: Improvement
Affects Versions: 0.10.0
Reporter: xiajun
Assignee: xiajun

 cached-node+port-socket in worker-data is an atom, so there is no need for
 the rwlock endpoint-socket-lock to protect it. Moreover, using the rwlock
 creates contention between refresh-connections and message sending.





[GitHub] storm pull request: STORM-593: remove endpoint-socket-lock for wor...

2014-12-15 Thread tedxia
Github user tedxia commented on the pull request:

https://github.com/apache/storm/pull/349#issuecomment-66996013
  
The write lock only protects cached-task-node+port; it does not protect
cached-node+port-socket.
@nathanmarz If we want to ensure send is never called on a closed connection,
shouldn't we protect cached-node+port-socket as well?




[GitHub] storm pull request: Update stats.clj rolling-window-set function...

2014-12-15 Thread BuDongDong
Github user BuDongDong commented on the pull request:

https://github.com/apache/storm/pull/348#issuecomment-67004693
  
@nathanmarz Could you check this bug? Thank you very much.




[GitHub] storm pull request: Added support for serialization to SequenceFil...

2014-12-15 Thread harshach
Github user harshach commented on the pull request:

https://github.com/apache/storm/pull/347#issuecomment-67024196
  
@mikert  Could you please open a JIRA for this. Thanks.




[jira] [Created] (STORM-594) Auto-Scaling Resources in a Topology

2014-12-15 Thread HARSHA BALASUBRAMANIAN (JIRA)
HARSHA BALASUBRAMANIAN created STORM-594:


 Summary: Auto-Scaling Resources in a Topology
 Key: STORM-594
 URL: https://issues.apache.org/jira/browse/STORM-594
 Project: Apache Storm
  Issue Type: New Feature
Reporter: HARSHA BALASUBRAMANIAN
Assignee: HARSHA BALASUBRAMANIAN
Priority: Minor








[jira] [Updated] (STORM-594) Auto-Scaling Resources in a Topology

2014-12-15 Thread HARSHA BALASUBRAMANIAN (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HARSHA BALASUBRAMANIAN updated STORM-594:
-
Attachment: Project Plan and Scope.pdf
Algorithm for Auto-Scaling.pdf

Files which describe this project

 Auto-Scaling Resources in a Topology
 

 Key: STORM-594
 URL: https://issues.apache.org/jira/browse/STORM-594
 Project: Apache Storm
  Issue Type: New Feature
Reporter: HARSHA BALASUBRAMANIAN
Assignee: HARSHA BALASUBRAMANIAN
Priority: Minor
 Attachments: Algorithm for Auto-Scaling.pdf, Project Plan and 
 Scope.pdf

   Original Estimate: 504h
  Remaining Estimate: 504h







[GitHub] storm pull request: Added support for serialization to SequenceFil...

2014-12-15 Thread Parth-Brahmbhatt
Github user Parth-Brahmbhatt commented on a diff in the pull request:

https://github.com/apache/storm/pull/347#discussion_r21839632
  
--- Diff: 
external/storm-hdfs/src/main/java/org/apache/storm/hdfs/bolt/SequenceFileBolt.java
 ---
@@ -108,7 +109,11 @@ public void execute(Tuple tuple) {
 try {
 long offset;
 synchronized (this.writeLock) {
-this.writer.append(this.format.key(tuple), 
this.format.value(tuple));
+if (this.format instanceof SerializableSequenceFormat) {
--- End diff --

The new interface and type checking are only necessary for backwards
compatibility; you could change the return type of the original SequenceFormat
interface to Object and no other code change would be necessary. I would add
that as a comment here, so that when we decide to do a major version bump we
can get rid of this extra code and the extra interface.
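The trade-off in this review comment can be sketched as follows (a Python sketch with illustrative names, not the storm-hdfs API): keep the original interface intact and type-check for the new sub-interface, versus widening the original return type and needing no check at all.

```python
class SequenceFormat:
    """Original interface: key()/value() return ready-to-write values."""
    def key(self, tup): raise NotImplementedError
    def value(self, tup): raise NotImplementedError

class SerializableSequenceFormat(SequenceFormat):
    """New sub-interface: adds explicitly serialized accessors."""
    def serialized_key(self, tup): raise NotImplementedError
    def serialized_value(self, tup): raise NotImplementedError

class PlainFormat(SequenceFormat):
    def key(self, tup): return tup["k"]
    def value(self, tup): return tup["v"]

class SerializedFormat(SerializableSequenceFormat):
    def serialized_key(self, tup): return ("ser", tup["k"])
    def serialized_value(self, tup): return ("ser", tup["v"])

def append(writer, fmt, tup):
    # Backwards-compatible dispatch: the isinstance check exists only so old
    # SequenceFormat implementations keep working; widening the original
    # return type instead would remove the need for it, at the cost of a
    # breaking (major-version) change.
    if isinstance(fmt, SerializableSequenceFormat):
        writer.append((fmt.serialized_key(tup), fmt.serialized_value(tup)))
    else:
        writer.append((fmt.key(tup), fmt.value(tup)))

out = []
append(out, PlainFormat(), {"k": 1, "v": 2})
append(out, SerializedFormat(), {"k": 1, "v": 2})
print(out)  # [(1, 2), (('ser', 1), ('ser', 2))]
```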




Re: Fwd: Auto-Scaling

2014-12-15 Thread Harsha Balasubramanian
Hi Bobby,

  Hope this email finds you well. I'm about to start designing the
auto-scaling system. I have created a JIRA for it:
https://issues.apache.org/jira/browse/STORM-594

Please take a look and let me know if there are any concerns.

I have two questions for you:
1. In one of your earlier emails, you mentioned that there is a *limited*
metrics implementation available in the current Storm version.
Will this provide an 'acks threshold'?

2. Is it possible to use the CLI tool from a local cluster?

Thanks,
Harsha

*Contact Details: *
Harsha Balasubramanian
Graduate Student at University of Toronto
Masters in Computer Science

Mobile : 647-771-3596
Email 1: harsha...@cs.toronto.edu
Email 2: harsha...@gmail.com
LinkedIn : ca.linkedin.com/in/harshabala/

On Wed, Nov 12, 2014 at 6:28 PM, Harsha Balasubramanian harsha...@gmail.com
 wrote:

 Thanks for the detailed explanation Bobby. I will keep this in mind when I
 create my design document.

 I will mostly not deal with adding/removing bolts to the topology and
 restrict myself to threads and tasks. This is because of the time I have to
 complete the project. Once I submit my report (early January), I can extend
 what I have implemented to more use cases.

 Thanks,
 Harsha

 Thanks,
 Harsha

 *Contact Details: *
 Harsha Balasubramanian
 Graduate Student at University of Toronto
 Professional Masters in Computer Science

 Email 1: harsha...@cs.toronto.edu
 Email 2: harsha...@gmail.com
 LinkedIn : ca.linkedin.com/in/harshabala/

 On Wed, Nov 12, 2014 at 6:05 PM, Bobby Evans ev...@yahoo-inc.com.invalid
 wrote:

 Sure,
 The main thing that storm is missing right now is an metrics feedback
 system to the scheduler (or possibly a separate piece of code that decides
 when to grow/shrink a topology).  We have some basic metrics, but they are
 not exposed to the scheduler.  The other question is how do we deal with
 creating/destroying new bolts, especially around dealing with different
 groupings.  Some groupings make it difficult.  There a number of ways to
 get around this, but I think the simplest way is to not create/destroy
 instances, but to treat it as a resources problem, and give them more or
 less resources as needed.  But that still needs to be discussed. - Bobby


  On Wednesday, November 12, 2014 3:10 PM, Harsha Balasubramanian 
 harsha...@gmail.com wrote:


  Hi Bobby,

   Thanks for getting back to me. It is encouraging to hear that this will
 be a good addition to Storm if done well.

   I have not implemented the changes yet. My project just started. It
 spans
 over the next 2 months. In a few days, I will create a JIRA task and
 submit
 my proposal. It would be great to brainstorm with the Storm community and
 iron out my design.


 Thanks,
 Harsha

 Thanks,
 Harsha

 *Contact Details: *
 Harsha Balasubramanian
 Graduate Student at University of Toronto
 Professional Masters in Computer Science

 Email 1: harsha...@cs.toronto.edu
 Email 2: harsha...@gmail.com
 LinkedIn : ca.linkedin.com/in/harshabala/

 On Wed, Nov 12, 2014 at 3:47 PM, Bobby Evans ev...@yahoo-inc.com.invalid
 
 wrote:

  Yes, this type of feature would be great to have.  I am really curious
  how you have done the changes, as there are a lot of potential pitfalls
  here.  At a minimum it would just be great to have the feedback
 framework
  in place so we can iterate on these changes.
   - Bobby
 
 
   On Wednesday, November 12, 2014 2:05 PM, Harsha st...@harsha.io
  wrote:
 
 
   Hi,
 It will definitely be interesting to the Storm community. It would be
 great if you could create a JIRA and post your code as a PR for others
 to try out and review.
  Thanks,
  Harsha
 
  On Wed, Nov 12, 2014, at 11:16 AM, Harsha Balasubramanian wrote:
   Please let me know if my project (outlined below) will be useful to
   Storm.
  
   Thanks,
   Harsha
  
   *Contact Details: *
   Harsha Balasubramanian
   Graduate Student at University of Toronto
   Professional Masters in Computer Science
  
   Email 1: harsha...@cs.toronto.edu
   Email 2: harsha...@gmail.com
   LinkedIn : ca.linkedin.com/in/harshabala/
  
   -- Forwarded message --
   From: Harsha Balasubramanian harsha...@gmail.com
   Date: Tue, Nov 11, 2014 at 8:15 PM
   Subject: Auto-Scaling
   To: dev@storm.apache.org
  
  
   Hi,
  
I am a Graduate student at the University of Toronto. As part of my
   Advanced Database Systems course, I have proposed to implement an Auto
   Scaling mechanism for Storm topologies (using a Feedback System).
  
   I've gone through the pages on how to contribute to the Storm project
 and
   have some questions. Please let me know if auto-scaling is being
 worked
   on
   currently. Also, should this be a project in StormCore or
 StormProcessor
   ?
  
   If this project will be a good addition to Storm, I will create a new
   JIRA
   task for it and add the details of my proposed implementation. Please
 do
   let me know.
  
   Thanks,
   Harsha
  
   *Contact 

[GitHub] storm pull request: STORM-329 : buffer message in client and recon...

2014-12-15 Thread clockfly
Github user clockfly commented on the pull request:

https://github.com/apache/storm/pull/268#issuecomment-67047625
  
@nathanmarz ,

I'd like to explain why I need to change worker.clj.

This was also motivated by a legacy TODO in zmq.clj. 

https://github.com/nathanmarz/storm/blob/0.8.2/src/clj/backtype/storm/messaging/zmq.clj#L43
```
  (send [this task message]
...
(mq/send socket message ZMQ/NOBLOCK)) ;; TODO: how to do backpressure 
if doing noblock?... need to only unblock if the target disappears
```
As we can see, the zeromq transport sends messages in a non-blocking way. 

If I understand this TODO correctly, it wants:
a) When the target worker is not booted yet, the source worker should not send 
messages to it. Otherwise, as there is no backpressure, there will be 
message loss during the bootup phase. If it is an unacked topology, the message 
loss is permanent; if it is an acked topology, we will need to do unnecessary 
replay.
b) When the target worker disappears in the middle (crash?), the source worker 
should drop the messages directly.

The problem is that the transport layer doesn't know by itself whether the 
target worker is booting up or has crashed in the running phase, so it cannot 
smartly choose between backpressure and drop.
If the transport simply chooses to block, that is good for the bootup 
phase but bad for the running phase: if one connection is down, it may block 
messages sent to other connections.
If the transport simply chooses to drop, that is good for the running phase 
but bad for the bootup phase: if the target worker is booted 30 seconds 
later, all messages during those 30 seconds will be dropped. 

The changes in worker.clj are targeted at solving this problem.
The worker knows when the target worker connections are ready.
In the bootup phase, worker.clj will wait until the target worker connection 
is ready, then activate the source worker tasks.
In the runtime phase, the transport will simply drop the messages if the 
target worker crashed in the middle.

There will be several benefits:
1. During cluster bootup, for an unacked topology, there will be no strange 
message loss.
2. During cluster bootup, for an acked topology, it takes less time to 
reach normal throughput, as there is no message loss, timeout, or replay.
3. For the transport layer, the design is simplified: we can just drop the 
messages if the target worker is not available. 
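A minimal sketch of the two-phase policy described above: buffer during bootup, deliver once the peer's connection is ready, and drop if the peer later dies. `ConnectionGate` and its states are hypothetical illustrations, not Storm's actual worker code:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the bootup-wait / runtime-drop policy; not a real
// Storm class.
public class ConnectionGate {
    enum State { BOOTING, READY, DEAD }

    private State state = State.BOOTING;
    private final List<String> delivered = new ArrayList<>();
    private final List<String> pending = new ArrayList<>();

    void markReady() {
        state = State.READY;
        // Bootup phase ends: flush everything buffered while waiting.
        delivered.addAll(pending);
        pending.clear();
    }

    void markDead() { state = State.DEAD; }

    // BOOTING: buffer (the source worker's tasks are not activated yet).
    // READY:   deliver normally.
    // DEAD:    drop; the acking/replay mechanism recovers if needed.
    void send(String message) {
        switch (state) {
            case BOOTING: pending.add(message); break;
            case READY:   delivered.add(message); break;
            case DEAD:    break;
        }
    }

    List<String> delivered() { return delivered; }
}
```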




[jira] [Commented] (STORM-329) Add Option to Config Message handling strategy when connection timeout

2014-12-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14247065#comment-14247065
 ] 

ASF GitHub Bot commented on STORM-329:
--

Github user clockfly commented on the pull request:

https://github.com/apache/storm/pull/268#issuecomment-67047625
  
@nathanmarz ,

I'd like to explain why I need to change worker.clj.

This was also motivated by a legacy TODO in zmq.clj. 

https://github.com/nathanmarz/storm/blob/0.8.2/src/clj/backtype/storm/messaging/zmq.clj#L43
```
  (send [this task message]
...
(mq/send socket message ZMQ/NOBLOCK)) ;; TODO: how to do backpressure 
if doing noblock?... need to only unblock if the target disappears
```
As we can see, the zeromq transport sends messages in a non-blocking way. 

If I understand this TODO correctly, it wants:
a) When the target worker is not booted yet, the source worker should not send 
messages to it. Otherwise, as there is no backpressure, there will be 
message loss during the bootup phase. If it is an unacked topology, the message 
loss is permanent; if it is an acked topology, we will need to do unnecessary 
replay.
b) When the target worker disappears in the middle (crash?), the source worker 
should drop the messages directly.

The problem is that the transport layer doesn't know by itself whether the 
target worker is booting up or has crashed in the running phase, so it cannot 
smartly choose between backpressure and drop.
If the transport simply chooses to block, that is good for the bootup 
phase but bad for the running phase: if one connection is down, it may block 
messages sent to other connections.
If the transport simply chooses to drop, that is good for the running phase 
but bad for the bootup phase: if the target worker is booted 30 seconds 
later, all messages during those 30 seconds will be dropped. 

The changes in worker.clj are targeted at solving this problem.
The worker knows when the target worker connections are ready.
In the bootup phase, worker.clj will wait until the target worker connection 
is ready, then activate the source worker tasks.
In the runtime phase, the transport will simply drop the messages if the 
target worker crashed in the middle.

There will be several benefits:
1. During cluster bootup, for an unacked topology, there will be no strange 
message loss.
2. During cluster bootup, for an acked topology, it takes less time to 
reach normal throughput, as there is no message loss, timeout, or replay.
3. For the transport layer, the design is simplified: we can just drop the 
messages if the target worker is not available. 


 Add Option to Config Message handling strategy when connection timeout
 --

 Key: STORM-329
 URL: https://issues.apache.org/jira/browse/STORM-329
 Project: Apache Storm
  Issue Type: Improvement
Affects Versions: 0.9.2-incubating
Reporter: Sean Zhong
Priority: Minor
  Labels: Netty
 Attachments: storm-329.patch, worker-kill-recover3.jpg


 This is to address a [concern brought 
 up|https://github.com/apache/incubator-storm/pull/103#issuecomment-43632986] 
 during the work at STORM-297:
 {quote}
 [~revans2] wrote: Your logic makes sense to me on why these calls are 
 blocking. My biggest concern around the blocking is in the case of a worker 
 crashing. If a single worker crashes this can block the entire topology from 
 executing until that worker comes back up. In some cases I can see that being 
 something that you would want. In other cases I can see speed being the 
 primary concern and some users would like to get partial data fast, rather 
 than accurate data later.
 Could we make it configurable on a follow up JIRA where we can have a max 
 limit to the buffering that is allowed, before we block, or throw data away 
 (which is what zeromq does)?
 {quote}
 If a worker crashes suddenly, how should we handle the messages that were 
 supposed to be delivered to it?
 1. Should we buffer all messages infinitely?
 2. Should we block message sending until the connection is resumed?
 3. Should we configure a buffer limit, try to buffer the messages first, and 
 block once the limit is met?
 4. Should we neither block nor buffer too much, but choose to drop the 
 messages and use the built-in Storm failover mechanism? 
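Option 3 above (buffer up to a limit, then block) maps naturally onto a bounded queue. The sketch below only illustrates the strategy; it is not Storm's Netty client code, and `BoundedSendBuffer` is a hypothetical name:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical illustration of "buffer first, then block" using a bounded queue.
public class BoundedSendBuffer {
    private final BlockingQueue<byte[]> buffer;

    BoundedSendBuffer(int limit) {
        buffer = new ArrayBlockingQueue<>(limit);
    }

    // offer() succeeds while under the limit; once the queue is full, put()
    // blocks until the connection drains it -- exactly strategy 3.
    void enqueue(byte[] msg) {
        try {
            if (!buffer.offer(msg)) {
                buffer.put(msg); // blocks when the buffer limit is met
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    int size() { return buffer.size(); }
    int remainingCapacity() { return buffer.remainingCapacity(); }
}
```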





[jira] [Updated] (STORM-594) Auto-Scaling Resources in a Topology

2014-12-15 Thread HARSHA BALASUBRAMANIAN (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HARSHA BALASUBRAMANIAN updated STORM-594:
-
Description: 
A useful feature missing in Storm topologies is the ability to auto-scale 
resources based on a pre-configured metric. The feature proposed here aims to 
build such an auto-scaling mechanism using a feedback system. A brief overview 
of the feature is provided here. The finer details of the required components 
and the scaling algorithm (which uses a feedback system) are provided in the 
PDFs attached.

Brief Overview:
Topologies may be created with or (ideally) without parallelism hints and 
tasks in their bolts and spouts before being submitted. If auto-scaling is set 
in the topology (using a Boolean flag), the topology will also get submitted to 
the auto-scale module.
The auto-scale module will read a pre-configured metric (threshold/min) from a 
configuration file. Using this value, the topology's resources will be modified 
until the threshold is reached. At each stage in the auto-scale module's 
execution, feedback from the previous execution will be used to tune the 
resources.

The systems that need to be in place to achieve this are:
1. Metrics which provide the current threshold (no. of acks per minute) for a 
topology's spouts and bolts.
2. Access to Storm's CLI tool, which can change a topology's resources at 
runtime.
3. A new Java or Clojure module which runs within the Nimbus daemon or in 
parallel to it. This will be the auto-scale module.

Limitations: (This is not an exhaustive list. More will be added as the design 
matures. Also, some of the points here may get resolved.)
To test the feature, there will be a number of limitations in the first 
release. As the feature matures, it will be allowed to scale further.
1. The auto-scale module will be limited to a few topologies (maybe 4 or 5 at 
maximum).
2. New bolts will not be added to scale a topology. This feature will be 
limited to increasing the resources within the existing topology.
3. Topology resources will not be decreased when it is running at more than the 
required number (except for a few cases).
4. This feature will work only for long-running topologies where the input 
threshold can become equal to or greater than the required threshold.


  was:
A useful feature missing in Storm topologies is the ability to auto-scale 
resources based on a pre-configured metric. The feature proposed here aims to 
build such an auto-scaling mechanism using a feedback system. A brief overview 
of the feature is provided here. The finer details of the required components 
and the scaling algorithm (which uses a feedback system) are provided in the 
PDFs attached.

Brief Overview:
Topologies may be created with or (ideally) without parallelism hints and 
tasks in their bolts and spouts before being submitted. If auto-scaling is set 
in the topology (using a Boolean flag), the topology will also get submitted to 
the auto-scale module.
The auto-scale module will read a pre-configured metric (threshold/min) from a 
configuration file. Using this value, the topology's resources will be modified 
until the threshold is reached. At each stage in the auto-scale module's 
execution, feedback from the previous execution will be used to tune the 
resources.

The systems that need to be in place to achieve this are:
1. Metrics which provide the current threshold (no. of acks per minute) for a 
topology's spouts and bolts.
2. Access to Storm's CLI tool, which can change a topology's resources at 
runtime.
3. A new Java or Clojure module which runs within the Nimbus daemon or in 
parallel to it. This will be the auto-scale module.

Limitations: (This is not an exhaustive list. More will be added as the design 
matures. Also, some of the points here may get resolved.)
To test the feature, there will be a number of limitations in the first 
release. As the feature matures, it will be allowed to scale further.
1. The auto-scale module will be limited to a few topologies (maybe 4 or 5 at 
maximum).
2. New bolts will not be added to scale a topology. This feature will be 
limited to increasing the resources within the existing topology.
3. Topology resources will not be decreased when it is running at more than the 
required number (except for a few cases).
4. This feature will not work for topologies where the input is much less 
than the required threshold.
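As a rough illustration of one feedback iteration in such a module (all names and the scaling rule below are hypothetical; the actual algorithm is in the attached PDFs):

```java
// Hypothetical sketch of one auto-scale feedback step: compare the measured
// acks/min against the configured threshold and propose a new parallelism.
public class AutoScaleStep {
    static int nextParallelism(int current, double acksPerMin, double threshold) {
        if (acksPerMin >= threshold) {
            return current;            // target met: leave resources alone
        }
        // Scale roughly in proportion to the shortfall, growing by at least
        // one executor so the loop always makes progress.
        int proposed = (int) Math.ceil(current * threshold / Math.max(acksPerMin, 1.0));
        return Math.max(proposed, current + 1);
    }
}
```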




 Auto-Scaling Resources in a Topology
 

 Key: STORM-594
 URL: https://issues.apache.org/jira/browse/STORM-594
 Project: Apache Storm
  Issue Type: New Feature
Reporter: HARSHA BALASUBRAMANIAN
Assignee: HARSHA BALASUBRAMANIAN
Priority: Minor
 Attachments: Algorithm for Auto-Scaling.pdf, Project Plan and 
 Scope.pdf

   Original Estimate: 504h
  Remaining Estimate: 504h

[GitHub] storm pull request: Storm-539. Storm hive bolt and trident state.

2014-12-15 Thread harshach
GitHub user harshach opened a pull request:

https://github.com/apache/storm/pull/350

Storm-539. Storm hive bolt and trident state.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/harshach/incubator-storm STORM-539-V2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/storm/pull/350.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #350


commit 01ab7b141c408809d6491556494bded0d436ba4d
Author: Sriharsha Chintalapani m...@harsha.io
Date:   2014-12-15T22:24:51Z

STORM-539. Storm hive bolt and trident state.

commit dfb8e3709d27691a2f97bbc3e49491f13a9769d1
Author: Sriharsha Chintalapani m...@harsha.io
Date:   2014-12-15T23:23:30Z

STORM-539. Storm hive bolt and trident state. Hive version to 0.14 and
update README.






[GitHub] storm pull request: Added support for serialization to SequenceFil...

2014-12-15 Thread mikert
Github user mikert commented on the pull request:

https://github.com/apache/storm/pull/347#issuecomment-67100672
  
@harshach Sure thing. Can you provide feedback about Parth-Brahmbhatt's 
comment? I did it that way to ensure that code written for 0.9.3 wouldn't be 
immediately broken in 0.10.0. However, I can easily make the change since it 
probably would be better.




[jira] [Created] (STORM-595) storm-hdfs can only work with sequence files that use Writables

2014-12-15 Thread Mike Thomsen (JIRA)
Mike Thomsen created STORM-595:
--

 Summary: storm-hdfs can only work with sequence files that use 
Writables
 Key: STORM-595
 URL: https://issues.apache.org/jira/browse/STORM-595
 Project: Apache Storm
  Issue Type: Bug
Affects Versions: 0.9.3
Reporter: Mike Thomsen
Assignee: Mike Thomsen
Priority: Minor


The current SequenceFormat interface requires that key() and value() return a 
class that implements Writable. This limitation makes it impossible to use 
object serialization systems like Avro or even Java serialization with the HDFS 
SequenceFileBolt.

Proposed solution: change SequenceFormat so that key() and value() return an 
Object, not a Writable. This would keep existing functionality for those 
implementing Writable support while allowing serialization support for those of 
us that need it.
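To illustrate what an Object-returning interface would enable, here is a hedged sketch of producing a value via plain Java serialization — the kind of value a Writable-only `key()`/`value()` cannot return. The class is a hypothetical example, not the storm-hdfs API:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical helper: round-trips any Serializable through plain Java
// serialization, independent of Hadoop's Writable.
public class JavaSerializedValue {
    static byte[] toBytes(Serializable obj) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
                oos.writeObject(obj);
            }
            return bos.toByteArray();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream ois =
                new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return ois.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new RuntimeException(e);
        }
    }
}
```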





[GitHub] storm pull request: STORM-586: TridentKafkaEmitter should catch up...

2014-12-15 Thread harshach
Github user harshach commented on a diff in the pull request:

https://github.com/apache/storm/pull/339#discussion_r21874150
  
--- Diff: external/storm-kafka/src/jvm/storm/kafka/KafkaUtils.java ---
@@ -155,7 +155,7 @@ public void refreshPartitions(SetPartition 
partitions) {
 }
 }
 
-public static ByteBufferMessageSet fetchMessages(KafkaConfig config, 
SimpleConsumer consumer, Partition partition, long offset) throws 
UpdateOffsetException {
+public static ByteBufferMessageSet fetchMessages(KafkaConfig config, 
SimpleConsumer consumer, Partition partition, long offset) throws 
TopicOffsetOutOfRangeException {
--- End diff --

Sorry, I should've mentioned this earlier, but can you add 
FailedFetchException and RuntimeException to the throws clause there? 
fetchMessages throws those two apart from TopicOffsetOutOfRangeException, but 
we only declare one of the exceptions.
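A hypothetical sketch of the fuller throws declaration being requested; the exception classes below are stand-ins for the storm-kafka types, and the method body is illustrative only:

```java
// Stand-in sketch: declaring unchecked exceptions in the throws clause
// documents every failure mode, even though the compiler does not require it.
public class KafkaUtilsSketch {
    static class FailedFetchException extends RuntimeException {}
    static class TopicOffsetOutOfRangeException extends FailedFetchException {}

    static byte[] fetchMessages(long offset)
            throws FailedFetchException, RuntimeException,
                   TopicOffsetOutOfRangeException {
        if (offset < 0) {
            // Mirrors the out-of-range case the PR handles.
            throw new TopicOffsetOutOfRangeException();
        }
        return new byte[0];
    }
}
```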




[jira] [Commented] (STORM-586) Trident kafka spout fails instead of updating offset when kafka offset is out of range.

2014-12-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14247618#comment-14247618
 ] 

ASF GitHub Bot commented on STORM-586:
--

Github user harshach commented on a diff in the pull request:

https://github.com/apache/storm/pull/339#discussion_r21874150
  
--- Diff: external/storm-kafka/src/jvm/storm/kafka/KafkaUtils.java ---
@@ -155,7 +155,7 @@ public void refreshPartitions(SetPartition 
partitions) {
 }
 }
 
-public static ByteBufferMessageSet fetchMessages(KafkaConfig config, 
SimpleConsumer consumer, Partition partition, long offset) throws 
UpdateOffsetException {
+public static ByteBufferMessageSet fetchMessages(KafkaConfig config, 
SimpleConsumer consumer, Partition partition, long offset) throws 
TopicOffsetOutOfRangeException {
--- End diff --

Sorry, I should've mentioned this earlier, but can you add 
FailedFetchException and RuntimeException to the throws clause there? 
fetchMessages throws those two apart from TopicOffsetOutOfRangeException, but 
we only declare one of the exceptions.


 Trident kafka spout fails instead of updating offset when kafka offset is out 
 of range.
 ---

 Key: STORM-586
 URL: https://issues.apache.org/jira/browse/STORM-586
 Project: Apache Storm
  Issue Type: Bug
Affects Versions: 0.9.3
Reporter: Parth Brahmbhatt
Assignee: Parth Brahmbhatt
Priority: Critical

 Trident KafkaEmitter does not catch the newly added UpdateOffsetException, 
 which results in the spout failing repeatedly instead of automatically 
 updating the offset to the earliest time. 
 PROBLEM: 
 Exception while using the Trident Kafka Spout.
 2014-12-04 18:38:03 b.s.util ERROR Async loop died! 
 java.lang.RuntimeException: storm.kafka.UpdateOffsetException 
 at 
 backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:107)
  ~storm-core-0.9.1.2.1.7.0-784.jar:0.9.1.2.1.7.0-784 
 at 
 backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:78)
  ~storm-core-0.9.1.2.1.7.0-784.jar:0.9.1.2.1.7.0-784 
 at 
 backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:77)
  ~storm-core-0.9.1.2.1.7.0-784.jar:0.9.1.2.1.7.0-784 
 at 
 backtype.storm.daemon.executor$fn_4195$fn4207$fn_4254.invoke(executor.clj:745)
  ~storm-core-0.9.1.2.1.7.0-784.jar:0.9.1.2.1.7.0-784 
 at backtype.storm.util$async_loop$fn__442.invoke(util.clj:436) 
 ~storm-core-0.9.1.2.1.7.0-784.jar:0.9.1.2.1.7.0-784 
 at clojure.lang.AFn.run(AFn.java:24) clojure-1.4.0.jar:na 
 at java.lang.Thread.run(Thread.java:745) na:1.7.0_71 
 Caused by: storm.kafka.UpdateOffsetException: null 
 at storm.kafka.KafkaUtils.fetchMessages(KafkaUtils.java:186) ~stormjar.jar:na 
 at 
 storm.kafka.trident.TridentKafkaEmitter.fetchMessages(TridentKafkaEmitter.java:132)
  ~stormjar.jar:na 
 at 
 storm.kafka.trident.TridentKafkaEmitter.doEmitNewPartitionBatch(TridentKafkaEmitter.java:113)
  ~stormjar.jar:na 
 at 
 storm.kafka.trident.TridentKafkaEmitter.failFastEmitNewPartitionBatch(TridentKafkaEmitter.java:72)
  ~stormjar.jar:na 
 at 
 storm.kafka.trident.TridentKafkaEmitter.access$400(TridentKafkaEmitter.java:46)
  ~stormjar.jar:na 
 at 
 storm.kafka.trident.TridentKafkaEmitter$2.emitPartitionBatchNew(TridentKafkaEmitter.java:233)
  ~stormjar.jar:na 
 at 
 storm.kafka.trident.TridentKafkaEmitter$2.emitPartitionBatchNew(TridentKafkaEmitter.java:225)
  ~stormjar.jar:na 
 at 
 storm.trident.spout.PartitionedTridentSpoutExecutor$Emitter$1.init(PartitionedTridentSpoutExecutor.java:125)
  ~storm-core-0.9.1.2.1.7.0-784.jar:0.9.1.2.1.7.0-784 
 at 
 storm.trident.topology.state.RotatingTransactionalState.getState(RotatingTransactionalState.java:83)
  ~storm-core-0.9.1.2.1.7.0-784.jar:0.9.1.2.1.7.0-784 
 at 
 storm.trident.topology.state.RotatingTransactionalState.getStateOrCreate(RotatingTransactionalState.java:110)
  ~storm-core-0.9.1.2.1.7.0-784.jar:0.9.1.2.1.7.0-784 
 at 
 storm.trident.spout.PartitionedTridentSpoutExecutor$Emitter.emitBatch(PartitionedTridentSpoutExecutor.java:121)
  ~storm-core-0.9.1.2.1.7.0-784.jar:0.9.1.2.1.7.0-784 
 at 
 storm.trident.spout.TridentSpoutExecutor.execute(TridentSpoutExecutor.java:82)
  ~storm-core-0.9.1.2.1.7.0-784.jar:0.9.1.2.1.7.0-784 
 at 
 storm.trident.topology.TridentBoltExecutor.execute(TridentBoltExecutor.java:369)
  ~storm-core-0.9.1.2.1.7.0-784.jar:0.9.1.2.1.7.0-784 
 at 
 backtype.storm.daemon.executor$fn_4195$tuple_action_fn_4197.invoke(executor.clj:630)
  ~storm-core-0.9.1.2.1.7.0-784.jar:0.9.1.2.1.7.0-784 
 at 
 backtype.storm.daemon.executor$mk_task_receiver$fn__4118.invoke(executor.clj:398)
  ~storm-core-0.9.1.2.1.7.0-784.jar:0.9.1.2.1.7.0-784 
 at 
 

[GitHub] storm pull request: update PartitionManager.java

2014-12-15 Thread harshach
Github user harshach commented on the pull request:

https://github.com/apache/storm/pull/331#issuecomment-67103228
  
+1




[GitHub] storm pull request: [STORM-557] Created docs directory and added i...

2014-12-15 Thread harshach
Github user harshach commented on the pull request:

https://github.com/apache/storm/pull/314#issuecomment-67103354
  
@revans2  can you please upmerge your patch. Thanks.




[jira] [Commented] (STORM-557) High Quality Images for presentations, etc.

2014-12-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14247631#comment-14247631
 ] 

ASF GitHub Bot commented on STORM-557:
--

Github user harshach commented on the pull request:

https://github.com/apache/storm/pull/314#issuecomment-67103354
  
@revans2  can you please upmerge your patch. Thanks.


 High Quality Images for presentations, etc.
 ---

 Key: STORM-557
 URL: https://issues.apache.org/jira/browse/STORM-557
 Project: Apache Storm
  Issue Type: Documentation
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans

 Recently I created a couple of svg diagrams for a poster I was doing about 
 secure storm.  I thought it was a complete waste to not release them as open 
 source, but I wasn't totally sure where we wanted to keep them.  I can check 
 them into git, because that would make it simple for others to find, and we 
 could link to them from the markdown documentation. But I could also put them 
 in the svn repo for the official site documentation.
 I'll throw up a very basic pull request and see what people think.





[GitHub] storm pull request: typo

2014-12-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/storm/pull/333




[GitHub] storm pull request: STORM-548:Receive Thread Shutdown hook should ...

2014-12-15 Thread harshach
Github user harshach commented on the pull request:

https://github.com/apache/storm/pull/307#issuecomment-67105369
  
@caofangkun  I am trying to merge this into master, but it looks like your PR 
contains an additional file, loader.clj~. Can you please remove it and update 
the PR? Thanks.




[GitHub] storm pull request: STORM-548:Receive Thread Shutdown hook should ...

2014-12-15 Thread harshach
Github user harshach commented on the pull request:

https://github.com/apache/storm/pull/307#issuecomment-67111034
  
@caofangkun Thanks for the quick fix. I merged the PR. 




[jira] [Commented] (STORM-548) Receive Thread Shutdown hook should connect to local hostname but not localhost

2014-12-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14247725#comment-14247725
 ] 

ASF GitHub Bot commented on STORM-548:
--

Github user asfgit closed the pull request at:

https://github.com/apache/storm/pull/307


 Receive Thread Shutdown hook should connect to local hostname but not 
 localhost 
 --

 Key: STORM-548
 URL: https://issues.apache.org/jira/browse/STORM-548
 Project: Apache Storm
  Issue Type: Bug
Affects Versions: 0.9.3-rc2
Reporter: caofangkun
Priority: Minor

 backtype.storm.messaging.loader#launch-receive-thread!
 kill-socket should connect to local hostname but not localhost
 See Code Line 72:
 https://github.com/apache/storm/blob/master/storm-core/src/clj/backtype/storm/messaging/loader.clj#L72
 {code:title=loader.clj|borderStyle=solid}
 Index: src/clj/backtype/storm/messaging/loader.clj
 ===
 --- src/clj/backtype/storm/messaging/loader.clj   (revision 4017)
 +++ src/clj/backtype/storm/messaging/loader.clj   (working copy)
 @@ -65,11 +65,12 @@
 :kill-fn (fn [t] (System/exit 1))
 :priority Thread/NORM_PRIORITY]
(let [max-buffer-size (int max-buffer-size)
 +local-hostname (memoized-local-hostname)
  socket (.bind ^IContext context storm-id port)
  thread-count (if receiver-thread-count receiver-thread-count 1)
  vthreads (mk-receive-threads context storm-id port transfer-local-fn 
 daemon kill-fn priority socket max-buffer-size thread-count)]
  (fn []
 -  (let [kill-socket (.connect ^IContext context storm-id "localhost" 
 port)]
 +  (let [kill-socket (.connect ^IContext context storm-id local-hostname 
 port)]
  (log-message "Shutting down receiving-thread: [" storm-id ", " port 
 "]")
  (.send ^IConnection kill-socket
 -1 (byte-array []))
  {code}  
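As a hypothetical illustration of why the distinction matters: "localhost" always resolves to the loopback address, while the machine's own hostname (what `memoized-local-hostname` returns in Storm) may map to an external interface, so the kill socket should use the latter:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Illustration only: checks whether a host name resolves to the loopback
// address, the way "localhost" does.
public class HostnameCheck {
    static boolean isLoopback(String host) {
        try {
            return InetAddress.getByName(host).isLoopbackAddress();
        } catch (UnknownHostException e) {
            throw new RuntimeException(e);
        }
    }
}
```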



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (STORM-548) Receive Thread Shutdown hook should connect to local hostname but not localhost

2014-12-15 Thread caofangkun (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

caofangkun reassigned STORM-548:


Assignee: caofangkun

 Receive Thread Shutdown hook should connect to local hostname but not 
 localhost 
 --

 Key: STORM-548
 URL: https://issues.apache.org/jira/browse/STORM-548
 Project: Apache Storm
  Issue Type: Bug
Affects Versions: 0.9.3-rc2
Reporter: caofangkun
Assignee: caofangkun
Priority: Minor
 Fix For: 0.10.0







[GitHub] storm pull request: Update stats.clj rolling-window-set function...

2014-12-15 Thread BuDongDong
Github user BuDongDong closed the pull request at:

https://github.com/apache/storm/pull/348


---