Jenkins build is back to normal : Geode-nightly-flaky #178

2017-11-21 Thread Apache Jenkins Server
See 




Build failed in Jenkins: Geode-nightly #1021

2017-11-21 Thread Apache Jenkins Server
See 


Changes:

[github] GEODE-3341: Convert DiskStoreCommandsDUnitTest to use gfsh rules (#1062)

[jiliao] GEODE-2676: fix NPE with ShowMetricsCommand.

[github] GEODE-3539: Add missing test coverage for 'list regions' and 'describe

[github] GEODE-3980: Remove unneeded additional findAvailablePids calls (#1076)

[github] GEODE-3999: Prevent prematurely running out of heap (#1078)

[bschuchardt] commit dade94b3b5a3a3b2178a62e31edab27ccca40aa8 Merge: 526bcfc 73be2d9

[github] GEODE-2567: Add --if-exists to destroy disk-store (#1080)

[bschuchardt] GEODE-3995: Moving server_api.proto to locator_api.proto.

--
[...truncated 111.40 KB...]
        at org.apache.geode.test.junit.rules.gfsh.GfshScript.execute(GfshScript.java:105)
        at org.apache.geode.management.internal.cli.commands.StopServerWithSecurityAcceptanceTest.startCluster(StopServerWithSecurityAcceptanceTest.java:113)
        at org.apache.geode.management.internal.cli.commands.StopServerWithSecurityAcceptanceTest.cannotStopServerAsClusterReaderOverJmx(StopServerWithSecurityAcceptanceTest.java:90)

org.apache.geode.management.internal.cli.commands.StopServerWithSecurityAcceptanceTest > cannotStopServerAsDataReaderOverHttp FAILED
    org.junit.ComparisonFailure: expected:<[0]> but was:<[1]>
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at org.apache.geode.test.junit.rules.gfsh.GfshScript.awaitIfNecessary(GfshScript.java:116)
        at org.apache.geode.test.junit.rules.gfsh.GfshRule.execute(GfshRule.java:94)
        at org.apache.geode.test.junit.rules.gfsh.GfshScript.execute(GfshScript.java:105)
        at org.apache.geode.management.internal.cli.commands.StopServerWithSecurityAcceptanceTest.startCluster(StopServerWithSecurityAcceptanceTest.java:113)
        at org.apache.geode.management.internal.cli.commands.StopServerWithSecurityAcceptanceTest.cannotStopServerAsDataReaderOverHttp(StopServerWithSecurityAcceptanceTest.java:60)

org.apache.geode.management.internal.cli.commands.StopServerWithSecurityAcceptanceTest > cannotStopServerAsDataReaderOverJmx FAILED
    org.junit.ComparisonFailure: expected:<[0]> but was:<[1]>
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at org.apache.geode.test.junit.rules.gfsh.GfshScript.awaitIfNecessary(GfshScript.java:116)
        at org.apache.geode.test.junit.rules.gfsh.GfshRule.execute(GfshRule.java:94)
        at org.apache.geode.test.junit.rules.gfsh.GfshScript.execute(GfshScript.java:105)
        at org.apache.geode.management.internal.cli.commands.StopServerWithSecurityAcceptanceTest.startCluster(StopServerWithSecurityAcceptanceTest.java:113)
        at org.apache.geode.management.internal.cli.commands.StopServerWithSecurityAcceptanceTest.cannotStopServerAsDataReaderOverJmx(StopServerWithSecurityAcceptanceTest.java:75)

org.apache.geode.management.internal.cli.commands.StopServerWithSecurityAcceptanceTest > canStopServerAsClusterAdminOverHttp FAILED
    org.junit.ComparisonFailure: expected:<[0]> but was:<[1]>
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at org.apache.geode.test.junit.rules.gfsh.GfshScript.awaitIfNecessary(GfshScript.java:116)
        at org.apache.geode.test.junit.rules.gfsh.GfshRule.execute(GfshRule.java:94)
        at org.apache.geode.test.junit.rules.gfsh.GfshScript.execute(GfshScript.java:105)
        at org.apache.geode.management.internal.cli.commands.StopServerWithSecurityAcceptanceTest.startCluster(StopServerWithSecurityAcceptanceTest.java:113)
        at org.apache.geode.management.internal.cli.commands.StopServerWithSecurityAcceptanceTest.canStopServerAsClusterAdminOverHttp(StopServerWithSecurityAcceptanceTest.java:68)

org.apache.geode.management.internal.cli.commands.StopServerAcceptanceTest > canStopServerByNameWhenConnectedOverJmx FAILED
    org.junit.ComparisonFailure: expected:<[0]> but was:<[1]>
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstr

Broken: apache/geode#4955 (develop - f429e9a)

2017-11-21 Thread Travis CI
Build Update for apache/geode
-

Build: #4955
Status: Broken

Duration: 21 minutes and 15 seconds
Commit: f429e9a (develop)
Author: Anton Mironenko
Message: GEODE-3038: A server process shuts down quietly when path to cache.xml is incorrect (#677)

Exception is thrown and logged when cache.xml is not found during cache creation

View the changeset: 
https://github.com/apache/geode/compare/938442323a05...f429e9a7eb5b

View the full build log and details: 
https://travis-ci.org/apache/geode/builds/305550630?utm_source=email&utm_medium=notification

--

You can configure recipients for build notifications in your .travis.yml file. 
See https://docs.travis-ci.com/user/notifications



[Spring CI] Spring Data GemFire > Nightly-ApacheGeode > #743 was SUCCESSFUL (with 2187 tests)

2017-11-21 Thread Spring CI

---
Spring Data GemFire > Nightly-ApacheGeode > #743 was successful.
---
Scheduled
2189 tests in total.

https://build.spring.io/browse/SGF-NAG-743/





--
This message is automatically generated by Atlassian Bamboo

Failed: jinmeiliao/geode#115 (alter - 91f219e)

2017-11-21 Thread Travis CI
Build Update for jinmeiliao/geode
-

Build: #115
Status: Failed

Duration: 20 minutes and 0 seconds
Commit: 91f219e (alter)
Author: Jinmei Liao
Message: GEODE-3788: add alter async-event-queue command and tests

View the changeset: https://github.com/jinmeiliao/geode/commit/91f219eb6b02

View the full build log and details: 
https://travis-ci.org/jinmeiliao/geode/builds/305409620?utm_source=email&utm_medium=notification




Re: [Discussion] Geode-Native Removing Stats from Public API

2017-11-21 Thread Vincent Ford
I would have some concerns with reducing the availability of stats or
devaluing the tooling for analysis. Although exposing more stats via JMX is
a great idea, most of the third-party tooling I have seen is best for
monitoring and not post-issue analysis. Not investing in jVSD seems like it
would make post-crash/issue analysis much more difficult when reviewing the
complex interactions among multiple distributed nodes. For example,
correlating performance issues among 10 JVMs (e.g., a GC in one node causing
performance hiccups detected in another node) can be extremely difficult
without a tool that can consume and overlay multiple graphs of the metrics
we are collecting from many sources. The power here is in the ability to
correlate multiple events from multiple JVMs into a single graphical view
for debugging purposes; without this capability it will be significantly
more difficult to understand the complex distributed behavior of Geode.

Currently custom stats are basically undocumented and difficult to use, so
removing them from the public API will probably have little impact on most
users. Most users who want to do some monitoring can use JMX for their
components, but it would be helpful to have a method to add those values
into the same stream as other stats/metrics for post-issue analysis.
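
To make the JMX side of this concrete, here is a minimal sketch that reads GC
counters from the standard platform MXBeans (java.lang.management) and tags
each reading with a timestamp and a member name, so that readings taken on
several JVMs could later be merged onto one timeline. It assumes nothing about
Geode or jVSD: the class GcSampleSketch, the Sample type, and the member name
"server-1" are hypothetical and exist only for illustration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: GcSampleSketch, Sample, and the member name are
// illustrative only and are not part of Geode or jVSD.
final class GcSampleSketch {

    /** One timestamped GC reading, tagged with the JVM (member) it came from. */
    static final class Sample {
        final String member;
        final String collector;
        final long timestampMillis;
        final long collectionCount;
        final long collectionTimeMillis;

        Sample(String member, String collector, long timestampMillis,
               long collectionCount, long collectionTimeMillis) {
            this.member = member;
            this.collector = collector;
            this.timestampMillis = timestampMillis;
            this.collectionCount = collectionCount;
            this.collectionTimeMillis = collectionTimeMillis;
        }

        @Override
        public String toString() {
            return timestampMillis + " " + member + " " + collector
                    + " count=" + collectionCount
                    + " timeMs=" + collectionTimeMillis;
        }
    }

    /** Reads every garbage collector MXBean registered in the local JVM. */
    static List<Sample> sampleLocalGc(String member) {
        long now = System.currentTimeMillis();
        List<Sample> samples = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // getCollectionCount/getCollectionTime return -1 if undefined for a collector.
            samples.add(new Sample(member, gc.getName(), now,
                    gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return samples;
    }

    public static void main(String[] args) {
        for (Sample s : sampleLocalGc("server-1")) {
            System.out.println(s);
        }
    }
}
```

Sorting such samples from all members by timestamp gives the raw material for
the kind of overlay described above; the graphing and storage parts are left
out of this sketch.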



*Vince Ford*
GemFire Toolsmith Engineering
Beaverton, OR USA
http://www.pivotal.io
Open Source Project Geode http://geode.apache.org/


On Mon, Nov 20, 2017 at 11:46 PM, Jacob Barrett  wrote:

> Couldn’t agree more for the Java side of things. The first step is
> deprecating the API for adding custom stats.
>
> > On Nov 20, 2017, at 11:13 PM, Swapnil Bawaskar 
> wrote:
> >
> > A lot of statistics we have are exposed over JMX. I think we should make
> an
> > effort to expose all the stats over JMX, making them consumable with
> > existing tooling rather than investing in jVSD.
> >
> >> On Mon, Nov 20, 2017 at 2:32 PM Addison Huddy 
> wrote:
> >>
> >> Thanks for the clarification Jake. So yes, I'm in agreement that we
> >> should simplify the C++ API and remove the stats API from the C++
> client.
> >>
> >> \ah
> >>
> >> On Mon, Nov 20, 2017 at 10:23 AM, Jacob Barrett 
> >> wrote:
> >>
> >>> To clarify, the goal here is to remove access from the public API to
> >> inject
> >>> custom stats into our stats stream. We would still collect stats for
> the
> >>> internals of the client.
> >>>
> >>> The rationale is multifaceted:
> >>>
> >>> 1) The C++ API would need a non-trivial amount of time to modernize.
> >> The
> >>> current API uses pointers of pointers for maintaining collections.
> There
> >> is
> >>> a strange cross relationship between components in the stats classes
> >> which
> >>> create unique pointer ownership questions. Rather than solving those
> now
> >>> and further dragging out the modernization of the C++ API we would drop
> >> it
> >>> and evaluate adding it back in a modern way in the future. Though I
> >>> suspect adding it back in the future will never happen for the reasons
> >>> below.
> >>>
> >>> 2) The storage format for our stats is proprietary to Geode and lacks
> >> wide
> >>> market adoption.
> >>>
> >>> 3) There are no modern tools for analyzing the statistics generated.
> >> Geode
> >>> lacks a tool for viewing or analyzing the statistics. Unless work is
> >>> prioritized on completing the jVSD application then users are forced to
> >>> write custom applications to extract the contents of the stats files.
> >>>
> >>> I support the removal from the Java public API for reasons 2 and 3.
> >> Unless
> >>> we put a full effort into creating the ecosystem around the stats
> format
> >> to
> >>> make it usable we should remove it from the public API. I strongly
> >>> encourage that we remove it internally too but that is for another
> >>> discussion.
> >>>
> >>> -Jake
> >>>
> >>>
>  On Mon, Nov 20, 2017 at 9:43 AM Michael Stolz 
> wrote:
> 
>  I'm not clear on why we are removing stats gathering capability.
>  Do we know that customers aren't using this?
>  Is it badly broken?
> 
>  What is actually driving this work?
> 
>  --
>  Mike Stolz
>  Principal Engineer, GemFire Product Lead
>  Mobile: +1-631-835-4771
> 
>  On Mon, Nov 20, 2017 at 11:42 AM, Bruce Schuchardt <
> >>> bschucha...@pivotal.io
> >
>  wrote:
> 
> > Should this be done for the Java caches as well?
> >
> >
> >> On 11/17/17 11:48 AM, David Kimura wrote:
> >>
> >> I agree, a statistics interface seems beyond the scope of Geode
> >> Native
> >> client responsibility.  Hiding or removing seems appropriate to me.
> >>
> >> Thanks,
> >> David
> >>
> >> On Fri, Nov 17, 2017 at 11:29 AM, Ernest Burghardt
> >>  wrote:
> >>
> >>> +1 for removal
> >>>
> >>> On Thu, Nov 16, 2017 at 1:46 PM, Jacob Ba

RE: DiskStore exception while region data evicted

2017-11-21 Thread Guy Turkenits
+ Viki

From: Gregory Vortman
Sent: Tuesday, November 21, 2017 6:49 PM
To: u...@geode.apache.org; dev@geode.apache.org
Cc: *Technology - Digital - BSS – Charging - GEODE team 
; Guy Turkenits 
Subject: DiskStore exception while region data evicted

Hi team,
One of the grid members goes down and its entire cache is closed whenever a 
partitioned region reaches its LRU threshold and overflow to disk starts:


The disk store is defined with 40 GB.

Actual metrics at the time of the crash: entries on disk 70, bytes only on disk ~1 GB.
There is plenty of room in the file system.

Can you help us understand the following exception?

[severe 2017/11/21 15:41:05.678 IST host1-pwinfo1  tid=0xdc] Fatal error from asynchronous flusher thread
org.apache.geode.InternalGemFireError: Bucket BucketRegion[path='/__PR/_B__EXTERNAL__RECORDS__1_171;serial=6025;primary=true] size (-1425) negative after applying delta of -2401
    at org.apache.geode.internal.cache.BucketRegion.updateBucketMemoryStats(BucketRegion.java:2291)
    at org.apache.geode.internal.cache.BucketRegion.updateBucket2Size(BucketRegion.java:2279)
    at org.apache.geode.internal.cache.BucketRegion.updateSizeOnEvict(BucketRegion.java:2157)
    at org.apache.geode.internal.cache.DiskEntry$Helper.writeEntryToDisk(DiskEntry.java:1441)
    at org.apache.geode.internal.cache.DiskEntry$Helper.doAsyncFlush(DiskEntry.java:1388)
    at org.apache.geode.internal.cache.DiskStoreImpl$FlusherThread.run(DiskStoreImpl.java:1729)
    at java.lang.Thread.run(Thread.java:748)

[error 2017/11/21 15:41:05.679 IST host1-pwinfo1  tid=0xdc] A DiskAccessException has occurred while writing to the disk for disk store ExternalRecord-overflow. The cache will be closed.
org.apache.geode.cache.DiskAccessException: For DiskStore: ExternalRecord-overflow: Fatal error from asynchronous flusher thread, caused by org.apache.geode.InternalGemFireError: Bucket BucketRegion[path='/__PR/_B__EXTERNAL__RECORDS__1_171;serial=6025;primary=true] size (-1425) negative after applying delta of -2401
    at org.apache.geode.internal.cache.DiskStoreImpl$FlusherThread.run(DiskStoreImpl.java:1774)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.geode.InternalGemFireError: Bucket BucketRegion[path='/__PR/_B__EXTERNAL__RECORDS__1_171;serial=6025;primary=true] size (-1425) negative after applying delta of -2401
    at org.apache.geode.internal.cache.BucketRegion.updateBucketMemoryStats(BucketRegion.java:2291)
    at org.apache.geode.internal.cache.BucketRegion.updateBucket2Size(BucketRegion.java:2279)
    at org.apache.geode.internal.cache.BucketRegion.updateSizeOnEvict(BucketRegion.java:2157)
    at org.apache.geode.internal.cache.DiskEntry$Helper.writeEntryToDisk(DiskEntry.java:1441)
    at org.apache.geode.internal.cache.DiskEntry$Helper.doAsyncFlush(DiskEntry.java:1388)
    at org.apache.geode.internal.cache.DiskStoreImpl$FlusherThread.run(DiskStoreImpl.java:1729)
    ... 1 more

Thanks

Gregory Vortman


This message and the information contained herein is proprietary and 
confidential and subject to the Amdocs policy statement,

you may review at https://www.amdocs.com/about/email-disclaimer 



DiskStore exception while region data evicted

2017-11-21 Thread Gregory Vortman
Hi team,
One of the grid members goes down and its entire cache is closed whenever a 
partitioned region reaches its LRU threshold and overflow to disk starts:


The disk store is defined with 40 GB.

Actual metrics at the time of the crash: entries on disk 70, bytes only on disk ~1 GB.
There is plenty of room in the file system.

Can you help us understand the following exception?

[severe 2017/11/21 15:41:05.678 IST host1-pwinfo1  tid=0xdc] Fatal error from asynchronous flusher thread
org.apache.geode.InternalGemFireError: Bucket BucketRegion[path='/__PR/_B__EXTERNAL__RECORDS__1_171;serial=6025;primary=true] size (-1425) negative after applying delta of -2401
    at org.apache.geode.internal.cache.BucketRegion.updateBucketMemoryStats(BucketRegion.java:2291)
    at org.apache.geode.internal.cache.BucketRegion.updateBucket2Size(BucketRegion.java:2279)
    at org.apache.geode.internal.cache.BucketRegion.updateSizeOnEvict(BucketRegion.java:2157)
    at org.apache.geode.internal.cache.DiskEntry$Helper.writeEntryToDisk(DiskEntry.java:1441)
    at org.apache.geode.internal.cache.DiskEntry$Helper.doAsyncFlush(DiskEntry.java:1388)
    at org.apache.geode.internal.cache.DiskStoreImpl$FlusherThread.run(DiskStoreImpl.java:1729)
    at java.lang.Thread.run(Thread.java:748)

[error 2017/11/21 15:41:05.679 IST host1-pwinfo1  tid=0xdc] A DiskAccessException has occurred while writing to the disk for disk store ExternalRecord-overflow. The cache will be closed.
org.apache.geode.cache.DiskAccessException: For DiskStore: ExternalRecord-overflow: Fatal error from asynchronous flusher thread, caused by org.apache.geode.InternalGemFireError: Bucket BucketRegion[path='/__PR/_B__EXTERNAL__RECORDS__1_171;serial=6025;primary=true] size (-1425) negative after applying delta of -2401
    at org.apache.geode.internal.cache.DiskStoreImpl$FlusherThread.run(DiskStoreImpl.java:1774)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.geode.InternalGemFireError: Bucket BucketRegion[path='/__PR/_B__EXTERNAL__RECORDS__1_171;serial=6025;primary=true] size (-1425) negative after applying delta of -2401
    at org.apache.geode.internal.cache.BucketRegion.updateBucketMemoryStats(BucketRegion.java:2291)
    at org.apache.geode.internal.cache.BucketRegion.updateBucket2Size(BucketRegion.java:2279)
    at org.apache.geode.internal.cache.BucketRegion.updateSizeOnEvict(BucketRegion.java:2157)
    at org.apache.geode.internal.cache.DiskEntry$Helper.writeEntryToDisk(DiskEntry.java:1441)
    at org.apache.geode.internal.cache.DiskEntry$Helper.doAsyncFlush(DiskEntry.java:1388)
    at org.apache.geode.internal.cache.DiskStoreImpl$FlusherThread.run(DiskStoreImpl.java:1729)
    ... 1 more

Thanks

Gregory Vortman
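
A note on the arithmetic in the error message, offered as a reading aid rather
than a diagnosis. If "size (-1425) negative after applying delta of -2401" is
read as "the running size became -1425 once -2401 was added", then the size
recorded just before the eviction was -1425 - (-2401) = 976 bytes, i.e. the
bucket was asked to give back more bytes than it had accounted for. The sketch
below reproduces that check with a plain counter. It is NOT Geode's
BucketRegion code: the class BucketSizeSketch and its methods are hypothetical
and only mirror the wording of the log line.

```java
// Hypothetical sketch: this is not Geode's BucketRegion implementation.
// It only illustrates a non-negative running-size invariant.
final class BucketSizeSketch {

    private long bytesInMemory;

    BucketSizeSketch(long initialBytes) {
        this.bytesInMemory = initialBytes;
    }

    /** Applies a signed delta and fails if the running total would go negative. */
    long applyDelta(long delta) {
        long updated = bytesInMemory + delta;
        if (updated < 0) {
            throw new IllegalStateException(
                    "size (" + updated + ") negative after applying delta of " + delta);
        }
        bytesInMemory = updated;
        return updated;
    }

    /** Recovers the size before the failing update from the two logged numbers. */
    static long sizeBeforeDelta(long reportedSize, long delta) {
        return reportedSize - delta;
    }

    public static void main(String[] args) {
        long before = sizeBeforeDelta(-1425, -2401);
        System.out.println("size before eviction delta: " + before); // prints 976

        BucketSizeSketch bucket = new BucketSizeSketch(before);
        try {
            bucket.applyDelta(-2401);
        } catch (IllegalStateException e) {
            // Same wording as the logged error, produced by the sketch's own check.
            System.out.println(e.getMessage());
        }
    }
}
```

Under that reading, the useful question for the logs is why an entry sized at
2401 bytes was evicted from a bucket whose tracked size was only 976 bytes.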

