[jira] [Commented] (GEODE-10) HDFS Integration
[ https://issues.apache.org/jira/browse/GEODE-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14630825#comment-14630825 ]

ASF subversion and git services commented on GEODE-10:
------------------------------------------------------

Commit a1abe19001105a69f5826ba697ad45588befa635 in incubator-geode's branch refs/heads/develop from Ashvin Agrawal
[ https://git-wip-us.apache.org/repos/asf?p=incubator-geode.git;h=a1abe19 ]

Merge branch 'feature/GEODE-10' into develop

HDFS Integration
----------------
                Key: GEODE-10
                URL: https://issues.apache.org/jira/browse/GEODE-10
            Project: Geode
         Issue Type: New Feature
         Components: hdfs
           Reporter: Dan Smith
           Assignee: Ashvin
        Attachments: GEODE-HDFSPersistence-Draft-060715-2109-21516.pdf

The ability to persist data on HDFS has been under development for GemFire; it was part of the latest code drop, GEODE-8. As part of this feature we are proposing some changes to the HdfsStore management API (see the attached doc for details):
# The current API has nested configuration for compaction and the async queue. This nested structure forces the user to execute multiple steps to manage a store, and it is not consistent with other management APIs.
# Some member names in the current API are confusing.

HDFS Integration: Geode acts as a transactional layer that micro-batches data out to Hadoop. This capability makes Geode a NoSQL store that can sit on top of Hadoop and parallelize the movement of data from the in-memory tier into Hadoop, making it very useful for capturing and processing fast data while also making that data available to Hadoop jobs relatively quickly.
The key requirements being met here are:
# Ingest data into HDFS in parallel
# Cache bloom filters and allow fast lookups of individual elements
# Have programmable policies for deciding what stays in memory
# Roll files in HDFS
# Index data that is in memory
# Have expiration policies that allow the transactional set to decay out older data
# The solution needs to support replicated and partitioned regions

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
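Requirement 2 above (cached bloom filters for fast lookups) can be sketched in isolation. The following is an illustrative stand-in, not Geode's actual implementation; the class and parameter names are hypothetical. The point is that a small in-memory bit array can rule out most "key not present" lookups without ever reading a file on HDFS:

```java
import java.util.BitSet;

// Illustrative bloom filter: a compact in-memory membership test that can
// rule out most absent keys before touching HDFS. False positives are
// possible (a read is then needed to confirm); false negatives are not.
public class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    public SimpleBloomFilter(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // Derive the i-th bit position from two hashes of the key
    // (the standard double-hashing trick).
    private int position(Object key, int i) {
        int h1 = key.hashCode();
        int h2 = Integer.rotateLeft(h1, 16) ^ 0x9E3779B9;
        return Math.floorMod(h1 + i * h2, size);
    }

    public void add(Object key) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(position(key, i));
        }
    }

    // true  => key MIGHT be in the HDFS files (read to confirm)
    // false => key is DEFINITELY absent (skip the HDFS read entirely)
    public boolean mightContain(Object key) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(position(key, i))) {
                return false;
            }
        }
        return true;
    }
}
```

Any key that was added is guaranteed to report `mightContain(...) == true`; only absent keys can (rarely) do so as well.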
[jira] [Commented] (GEODE-10) HDFS Integration
[ https://issues.apache.org/jira/browse/GEODE-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14630822#comment-14630822 ]

ASF subversion and git services commented on GEODE-10:
------------------------------------------------------

Commit 3772869d02148eec8b5ce97fbf1af9415bccd98c in incubator-geode's branch refs/heads/develop from Ashvin Agrawal
[ https://git-wip-us.apache.org/repos/asf?p=incubator-geode.git;h=3772869 ]

GEODE-10: Refactor HdfsStore api to match spec

* Currently HdfsStore's configuration object is nested, and a user needs to create multiple sub-objects to manage the store instance. This is less usable and gets confusing at times; the user is also exposed to a lot of internal details. Replacing the nested configuration with a flat structure will be better.
* Rename members
[jira] [Commented] (GEODE-10) HDFS Integration
[ https://issues.apache.org/jira/browse/GEODE-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14630824#comment-14630824 ]

ASF subversion and git services commented on GEODE-10:
------------------------------------------------------

Commit 220eb234d77de1f04585de7b82f6538f5399e1a4 in incubator-geode's branch refs/heads/develop from Ashvin Agrawal
[ https://git-wip-us.apache.org/repos/asf?p=incubator-geode.git;h=220eb23 ]

GEODE-10: Rename write-only method

Address review comment: rename WriteOnlyFileSizeLimit to WriteOnlyFileRolloverSize to make it consistent with WriteOnlyFileRolloverInterval.
[jira] [Created] (GEODE-134) AnalyzeSerializablesJUnitTest is failing after hdfs api changes
Ashvin created GEODE-134:
-------------------------
            Summary: AnalyzeSerializablesJUnitTest is failing after hdfs api changes
                Key: GEODE-134
                URL: https://issues.apache.org/jira/browse/GEODE-134
            Project: Geode
         Issue Type: Bug
         Components: core
   Affects Versions: 1.0.0-incubating
           Reporter: Ashvin
           Assignee: Ashvin
            Fix For: 1.0.0-incubating

As part of the updates for GEODE-10, serialized hdfs classes were modified. The changes are causing AnalyzeSerializablesJUnitTest to fail.
[jira] [Resolved] (GEODE-10) HDFS Integration
[ https://issues.apache.org/jira/browse/GEODE-10?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashvin resolved GEODE-10.
-------------------------
    Resolution: Fixed
[jira] [Commented] (GEODE-9) Spark Integration
[ https://issues.apache.org/jira/browse/GEODE-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631646#comment-14631646 ]

Jianxia Chen commented on GEODE-9:
----------------------------------

Once all sub-tasks are closed, we will merge feature/GEODE-9 into the develop branch and close the issue (GEODE-9).

Spark Integration
-----------------
                Key: GEODE-9
                URL: https://issues.apache.org/jira/browse/GEODE-9
            Project: Geode
         Issue Type: New Feature
         Components: core, extensions
           Reporter: Dan Smith
           Assignee: Jason Huynh
             Labels: asf-migration

This is a feature that has been under development for GemFire but was not part of the initial drop of code for Geode. What is being enabled here is Geode as a data store for Spark applications. By providing a bridge-style connector for Spark applications, Geode can become the store for intermediate and final state of Spark applications and allow reference data in the in-memory tier to be accessed very efficiently:
* Expose Geode regions as Spark RDDs
* Write Spark RDDs to Geode regions
* Execute arbitrary OQL queries in your Spark applications
[jira] [Commented] (GEODE-108) HAInterestPart2DUnitTest.testInterestRecoveryFailure failed with suspect string
[ https://issues.apache.org/jira/browse/GEODE-108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631676#comment-14631676 ]

ASF subversion and git services commented on GEODE-108:
-------------------------------------------------------

Commit a336e81c2fe7666f8e866d3d22f58adcd4a01c18 in incubator-geode's branch refs/heads/develop from [~apa...@the9muses.net]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-geode.git;h=a336e81 ]

GEODE-108: Fix up HAInterestPart*DUnitTests

Reformat, fix timeouts, reduce the interval for checking asynchronous criteria. Move tearDown2 up next to setUp. Null out static fields during closeCache so other dunit JVMs don't leak Cache/DS instances. Remove a sleep call. Add @SuppressWarnings and @Override annotations. Use addExpectedExceptions to fix suspect-string failures (including the one that Jenkins hit). Fix typos. Add a JUnit 4 TestCase for running these tests together in isolation.

HAInterestPart2DUnitTest.testInterestRecoveryFailure failed with suspect string
-------------------------------------------------------------------------------
                Key: GEODE-108
                URL: https://issues.apache.org/jira/browse/GEODE-108
            Project: Geode
         Issue Type: Bug
           Reporter: Kirk Lund
           Assignee: Kirk Lund

{code}
com.gemstone.gemfire.internal.cache.tier.sockets.HAInterestPart2DUnitTest testInterestRecoveryFailure FAILED
    java.lang.AssertionError: Suspicious strings were written to the log during this run.
    Fix the strings or use DistributedTestCase.addExpectedException to ignore.
    ---
    Found suspect string in log4j at line 3864
    java.net.SocketException: Connection reset by peer
{code}
[jira] [Commented] (GEODE-123) Increase maximum line width Eclipse formatter from 80 to 160
[ https://issues.apache.org/jira/browse/GEODE-123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631678#comment-14631678 ]

ASF subversion and git services commented on GEODE-123:
-------------------------------------------------------

Commit b1f39b609ab18751c6adca48e5a4be4f420c2743 in incubator-geode's branch refs/heads/develop from [~apa...@the9muses.net]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-geode.git;h=b1f39b6 ]

GEODE-123: Increase maximum line width from 80 to 160.

Increase maximum line width Eclipse formatter from 80 to 160
------------------------------------------------------------
                Key: GEODE-123
                URL: https://issues.apache.org/jira/browse/GEODE-123
            Project: Geode
         Issue Type: Bug
           Reporter: Kirk Lund
           Assignee: Kirk Lund
           Priority: Minor

Proposal to change the etc/eclipseFormatterProfile.xml maximum line width from 80 to 160, due to the prevalence of modern wide-screen monitors.
[jira] [Commented] (GEODE-134) AnalyzeSerializablesJUnitTest is failing after hdfs api changes
[ https://issues.apache.org/jira/browse/GEODE-134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631721#comment-14631721 ]

ASF subversion and git services commented on GEODE-134:
-------------------------------------------------------

Commit a571fc6c67d1fe45a5473b312859f040805dd953 in incubator-geode's branch refs/heads/develop from Ashvin Agrawal
[ https://git-wip-us.apache.org/repos/asf?p=incubator-geode.git;h=a571fc6 ]

GEODE-134: Fix Serializable tests

The HDFS classes are new and do not break backward compatibility.
[jira] [Updated] (GEODE-123) Increase maximum line width Eclipse formatter from 80 to 160
[ https://issues.apache.org/jira/browse/GEODE-123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Blum updated GEODE-123:
----------------------------
    Attachment: intellijIdeaGeodeCodeStyle.xml

Attaching the IntelliJ IDEA code style scheme I have been using for _Pivotal GemFire/Apache Geode_.
[jira] [Created] (GEODE-137) Spark Connector: should connect to local GemFire server if possible
Qihong Chen created GEODE-137:
------------------------------
            Summary: Spark Connector: should connect to local GemFire server if possible
                Key: GEODE-137
                URL: https://issues.apache.org/jira/browse/GEODE-137
            Project: Geode
         Issue Type: Bug
           Reporter: Qihong Chen
           Assignee: Qihong Chen

DefaultGemFireConnection uses ClientCacheFactory with locator info to create the ClientCache instance. In this case, the ClientCache doesn't connect to the GemFire/Geode server on the same host even if there is one, which causes more network traffic and is less efficient. ClientCacheFactory can also create a ClientCache from GemFire server info, so we can make the ClientCache connect to the local GemFire server when possible.
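The proposed preference can be sketched as plain host-matching logic. This is a hypothetical illustration, not the connector's actual API; the class and method names are invented. Given the cluster's server hosts, pick a co-located server when one exists, otherwise fall back to any remote one:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.List;

// Sketch of "prefer the local server": scan the server list for this host's
// name and use that server if found, avoiding cross-host network hops.
// All names here are illustrative, not the Spark connector's API.
public class LocalServerPreference {

    public static String pickServer(List<String> serverHosts, String localHostName) {
        for (String host : serverHosts) {
            if (host.equals(localHostName)) {
                return host; // co-located server: traffic stays on this machine
            }
        }
        return serverHosts.get(0); // no local server; use a remote one
    }

    public static String pickServer(List<String> serverHosts) throws UnknownHostException {
        return pickServer(serverHosts, InetAddress.getLocalHost().getHostName());
    }
}
```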
[jira] [Created] (GEODE-136) Fix possible NullPointerException in Gfsh's 'list regions' command's GetRegionsFunction.
John Blum created GEODE-136:
----------------------------
            Summary: Fix possible NullPointerException in Gfsh's 'list regions' command's GetRegionsFunction.
                Key: GEODE-136
                URL: https://issues.apache.org/jira/browse/GEODE-136
            Project: Geode
         Issue Type: Bug
         Components: management tools
   Affects Versions: 1.0.0-incubating
        Environment: GemFire Manager + Gfsh
           Reporter: John Blum

The following line ([#48|https://github.com/apache/incubator-geode/blob/develop/gemfire-core/src/main/java/com/gemstone/gemfire/management/internal/cli/functions/GetRegionsFunction.java#L48]) in the {{GetRegionsFunction}} class could lead to an NPE if the {{regions}} _Set_ is null, since the {{regions.isEmpty()}} call precedes the {{regions == null}} check. Of course, one could argue whether {{Cache.rootRegions()}} should return a null _Set_ at all, rather than an empty _Set_, when there are in fact no root _Regions_ in the Geode _Cache_. But then, one could also argue that {{GetRegionsFunction}} should not return a null array when there are no root _Regions_ in the Geode _Cache_ either; it too should return an empty array.
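The bug is simply the order of the two checks. The fragment below isolates the pattern; it is not the actual GetRegionsFunction code, just the general shape of the fix. With `||`, Java's short-circuit evaluation never reaches the right-hand operand when the left is already true, so the null test must come first:

```java
import java.util.Collections;
import java.util.Set;

// Isolates the ordering bug from GEODE-136: the null test must precede
// isEmpty(), otherwise regions.isEmpty() dereferences a null reference.
public class NullCheckOrder {

    // Buggy shape (do NOT do this): NPE when regions == null, because
    // regions.isEmpty() runs before the null check.
    //   if (regions.isEmpty() || regions == null) { ... }

    public static boolean hasNoRegions(Set<?> regions) {
        // Correct order: when regions == null is true, short-circuit
        // evaluation skips regions.isEmpty() entirely.
        return regions == null || regions.isEmpty();
    }

    // Defensive variant of the report's other suggestion: normalize a null
    // collection to an empty one so callers never see null at all.
    public static Set<?> nullSafe(Set<?> regions) {
        return regions == null ? Collections.emptySet() : regions;
    }
}
```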
[jira] [Resolved] (GEODE-108) HAInterestPart2DUnitTest.testInterestRecoveryFailure failed with suspect string
[ https://issues.apache.org/jira/browse/GEODE-108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kirk Lund resolved GEODE-108.
-----------------------------
       Resolution: Fixed
    Fix Version/s: 1.0.0-incubating
[jira] [Commented] (GEODE-123) Increase maximum line width Eclipse formatter from 80 to 160
[ https://issues.apache.org/jira/browse/GEODE-123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631847#comment-14631847 ]

John Blum commented on GEODE-123:
---------------------------------

IntelliJ IDEA will format your code according to the Code Style Scheme configured for the project. This can be done globally for all projects, or on a project-by-project basis. In Preferences > Code Style > General, for a particular Scheme, the dialog has a "Right margin (columns)" setting that determines the maximum line width. When you run the formatter (Reformat Code, CMD+ALT+L) on a source file, it will wrap lines according to the project's configured Code Style Scheme.
[jira] [Resolved] (GEODE-120) RDD.saveToGemfire() can not handle big dataset (1M entries per partition)
[ https://issues.apache.org/jira/browse/GEODE-120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Qihong Chen resolved GEODE-120.
-------------------------------
    Resolution: Fixed

RDD.saveToGemfire() can not handle big dataset (1M entries per partition)
-------------------------------------------------------------------------
                Key: GEODE-120
                URL: https://issues.apache.org/jira/browse/GEODE-120
            Project: Geode
         Issue Type: Sub-task
         Components: core, extensions
   Affects Versions: 1.0.0-incubating
           Reporter: Qihong Chen
           Assignee: Qihong Chen
  Original Estimate: 48h
 Remaining Estimate: 48h

The connector uses a single region.putAll() call to save each RDD partition, but putAll() doesn't handle large datasets well (such as 1M records). We need to split the dataset into smaller chunks and invoke putAll() for each chunk.
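The chunking step described above can be sketched generically. This is an illustrative helper under assumed names, not the connector's actual code, and the chunk size of 10,000 is an invented example value:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch of the GEODE-120 fix: rather than one region.putAll() over a whole
// RDD partition (possibly 1M entries), split the data into bounded chunks
// and put each chunk separately.
public class ChunkedPut {

    public static <K, V> List<Map<K, V>> chunk(Map<K, V> data, int chunkSize) {
        List<Map<K, V>> chunks = new ArrayList<>();
        Map<K, V> current = new LinkedHashMap<>();
        for (Map.Entry<K, V> e : data.entrySet()) {
            current.put(e.getKey(), e.getValue());
            if (current.size() == chunkSize) {
                chunks.add(current);               // chunk is full; start a new one
                current = new LinkedHashMap<>();
            }
        }
        if (!current.isEmpty()) {
            chunks.add(current);                   // trailing partial chunk
        }
        return chunks;
    }
}
```

A writer would then loop over the chunks, e.g. `for (Map<K, V> c : ChunkedPut.chunk(partitionData, 10_000)) region.putAll(c);`, bounding the size of any single putAll call.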
[jira] [Closed] (GEODE-114) There's race condition in DefaultGemFireConnection.getRegionProxy
[ https://issues.apache.org/jira/browse/GEODE-114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Qihong Chen closed GEODE-114.
-----------------------------

There's race condition in DefaultGemFireConnection.getRegionProxy
-----------------------------------------------------------------
                Key: GEODE-114
                URL: https://issues.apache.org/jira/browse/GEODE-114
            Project: Geode
         Issue Type: Sub-task
         Components: core, extensions
   Affects Versions: 1.0.0-incubating
           Reporter: Qihong Chen
           Assignee: Qihong Chen
            Fix For: 1.0.0-incubating
  Original Estimate: 24h
 Remaining Estimate: 24h

When multiple threads try to call getRegionProxy with the same region at the same time, the following exception was thrown:

com.gemstone.gemfire.cache.RegionExistsException: /debs
    at com.gemstone.gemfire.internal.cache.GemFireCacheImpl.createVMRegion(GemFireCacheImpl.java:2880)
    at com.gemstone.gemfire.internal.cache.GemFireCacheImpl.basicCreateRegion(GemFireCacheImpl.java:2835)
    at com.gemstone.gemfire.cache.client.internal.ClientRegionFactoryImpl.create(ClientRegionFactoryImpl.java:223)
    at io.pivotal.gemfire.spark.connector.internal.DefaultGemFireConnection.getRegionProxy(DefaultGemFireConnection.scala:87)
    at io.pivotal.gemfire.spark.connector.internal.rdd.GemFirePairRDDWriter.write(GemFireRDDWriter.scala:47)
    at io.pivotal.gemfire.spark.connector.GemFirePairRDDFunctions$$anonfun$saveToGemfire$2.apply(GemFirePairRDDFunctions.scala:24)
    at io.pivotal.gemfire.spark.connector.GemFirePairRDDFunctions$$anonfun$saveToGemfire$2.apply(GemFirePairRDDFunctions.scala:24)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
    at org.apache.spark.scheduler.Task.run(Task.scala:64)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
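The stack trace shows the classic check-then-create race: two threads both see "no proxy region yet" and both call create, so the second one throws RegionExistsException. One standard way to close such a race is to funnel creation through an atomic compute-if-absent, sketched below with invented names; this is not the connector's actual fix, just the general pattern:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Generic shape of a fix for the GEODE-114 race: cache region proxies in a
// ConcurrentMap so that creation happens at most once per region name, even
// when many threads race on the same name. Names are illustrative only.
public class RegionProxyCache<R> {

    public interface RegionCreator<R> {
        R create(String name);
    }

    private final ConcurrentMap<String, R> proxies = new ConcurrentHashMap<>();
    private final RegionCreator<R> creator;

    public RegionProxyCache(RegionCreator<R> creator) {
        this.creator = creator;
    }

    // ConcurrentHashMap.computeIfAbsent applies the mapping function
    // atomically and at most once per absent key, so racing threads all
    // receive the same proxy and "already exists" can never be thrown.
    public R getRegionProxy(String name) {
        return proxies.computeIfAbsent(name, creator::create);
    }
}
```

An alternative is to catch the "already exists" exception and fall back to looking up the existing region, but the compute-if-absent shape avoids the exception entirely.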