[jira] [Created] (IGNITE-14502) Ignite 3: Consider JSON-formatted toString implementations
Alexey Goncharuk created IGNITE-14502:

Summary: Ignite 3: Consider JSON-formatted toString implementations
Key: IGNITE-14502
URL: https://issues.apache.org/jira/browse/IGNITE-14502
Project: Ignite
Issue Type: Improvement
Reporter: Alexey Goncharuk

This is a follow-up for IGNITE-14501. Once {{GridToStringBuilder}} is ported to Ignite 3, we may change the formatting to a JSON-like structure so that any object can be beautified using standard tools.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
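As a rough illustration of the JSON-like output proposed in IGNITE-14502, the sketch below renders field name/value pairs as a JSON object. The class {{JsonToStringSketch}} and its methods are hypothetical; this is not the {{GridToStringBuilder}} API, and the escaping is deliberately minimal.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not an Ignite API): renders name/value pairs as a JSON-like object. */
final class JsonToStringSketch {
    /** Fields in insertion order. */
    private final Map<String, Object> fields = new LinkedHashMap<>();

    /** Adds a field and returns this builder for chaining. */
    JsonToStringSketch add(String name, Object val) {
        fields.put(name, val);

        return this;
    }

    /** Escapes backslashes and double quotes only; control characters are out of scope here. */
    private static String quote(String s) {
        return '"' + s.replace("\\", "\\\\").replace("\"", "\\\"") + '"';
    }

    /** Numbers, booleans and nulls are printed bare, everything else as a quoted string. */
    @Override public String toString() {
        StringBuilder sb = new StringBuilder("{");

        boolean first = true;

        for (Map.Entry<String, Object> e : fields.entrySet()) {
            if (!first)
                sb.append(',');

            first = false;

            sb.append(quote(e.getKey())).append(':');

            Object v = e.getValue();

            if (v == null || v instanceof Number || v instanceof Boolean)
                sb.append(v);
            else
                sb.append(quote(String.valueOf(v)));
        }

        return sb.append('}').toString();
    }
}
```

A class could then implement {{toString()}} as {{return new JsonToStringSketch().add("name", name).add("port", port).toString();}}, and the resulting line could be fed to any JSON pretty-printer.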
[jira] [Created] (IGNITE-14501) Ignite 3: Fix toString implementations
Alexey Goncharuk created IGNITE-14501:

Summary: Ignite 3: Fix toString implementations
Key: IGNITE-14501
URL: https://issues.apache.org/jira/browse/IGNITE-14501
Project: Ignite
Issue Type: Improvement
Reporter: Alexey Goncharuk
Fix For: 3.0.0-alpha2

There are a number of places in the codebase where autogenerated IDEA {{toString()}} implementations are used (for example, {{ModuleRegistry}}, {{NetworkMember}}) or where the {{toString()}} implementations are non-conformant ({{Peer}} in raft-client). We need to fix the {{toString()}} implementations and move {{GridToStringBuilder}} from Ignite 2.x to avoid further similar issues.
[jira] [Created] (IGNITE-14459) Affinity call may fail if called upon merged exchanges
Alexey Goncharuk created IGNITE-14459:

Summary: Affinity call may fail if called upon merged exchanges
Key: IGNITE-14459
URL: https://issues.apache.org/jira/browse/IGNITE-14459
Project: Ignite
Issue Type: Improvement
Components: compute
Reporter: Alexey Goncharuk

When exchanges are merged, intermediate affinity assignments are not filled. At the same time, when a client chooses topology to run affinity call on, it may take a non-completed exchange version. As a result, when the affinity fetch task arrives on a node, it will look up a non-existing assignment, resulting in "Getting affinity for topology version earlier than affinity is calculated" exception.

{{CacheAffinityCallSelfTest.testAffinityCallNoServerNode}} is flaky because of this bug.

The following test case for {{CacheAffinityCallSelfTest}} demonstrates the issue:
{code}
/**
 * @throws Exception if failed.
 */
@Test
public void testAffinityCallMergedExchanges() throws Exception {
    startGrids(SRVS);

    final Integer key = 1;

    final IgniteEx client = startClientGrid(SRVS);

    assertTrue(client.configuration().isClientMode());
    assertNull(client.context().cache().cache(CACHE_NAME));

    try {
        grid(0).context().cache().context().exchange().mergeExchangesTestWaitVersion(
            new AffinityTopologyVersion(SRVS + 3, 0),
            null
        );

        IgniteInternalFuture fut1 = GridTestUtils.runAsync(() -> startGrid(SRVS + 1));

        assertTrue(GridTestUtils.waitForCondition(() -> client.context().cache().context()
            .exchange().lastTopologyFuture()
            .initialVersion().equals(new AffinityTopologyVersion(SRVS + 2, 0)), 5_000));

        assertFalse(fut1.isDone()); // The future should not complete until second node is started.

        IgniteInternalFuture fut2 = GridTestUtils.runAsync(() ->
            client.compute().affinityCall(CACHE_NAME, key, new CheckCallable(key, null)));

        startGrid(SRVS + 2);

        fut1.get();
        fut2.get();
    }
    finally {
        stopAllGrids();
    }
}
{code}
[jira] [Created] (IGNITE-14393) Describe components interactions with metastorage
Alexey Goncharuk created IGNITE-14393:

Summary: Describe components interactions with metastorage
Key: IGNITE-14393
URL: https://issues.apache.org/jira/browse/IGNITE-14393
Project: Ignite
Issue Type: Improvement
Components: documentation
Affects Versions: 3.0.0-alpha1
Reporter: Alexey Goncharuk
Fix For: 3.0.0-alpha2

We want to use metastorage as the golden source of the cluster state. We need to describe how components will interact with each other based on metastorage key writes and watches.
[jira] [Created] (IGNITE-14325) SQL COPY command: when conversion fails, the error does not contain information about line number and the failed value
Alexey Goncharuk created IGNITE-14325:

Summary: SQL COPY command: when conversion fails, the error does not contain information about line number and the failed value
Key: IGNITE-14325
URL: https://issues.apache.org/jira/browse/IGNITE-14325
Project: Ignite
Issue Type: Improvement
Affects Versions: 2.10
Reporter: Alexey Goncharuk

I was trying to import data from a CSV file to an Ignite cache through sqlline. When a file contains a value that cannot be converted to the schema format, the error message printed by the client is absolutely useless:
{code}
Error: Server error: class org.apache.ignite.internal.processors.query.IgniteSQLException: Value conversion failed [column=PICKUP_DATETIME, from=java.lang.String, to=java.sql.Timestamp] (state=5,code=1)
java.sql.SQLException: Server error: class org.apache.ignite.internal.processors.query.IgniteSQLException: Value conversion failed [column=PICKUP_DATETIME, from=java.lang.String, to=java.sql.Timestamp]
    at org.apache.ignite.internal.jdbc.thin.JdbcThinConnection.sendRequest(JdbcThinConnection.java:1009)
    at org.apache.ignite.internal.jdbc.thin.JdbcThinStatement.sendFile(JdbcThinStatement.java:336)
    at org.apache.ignite.internal.jdbc.thin.JdbcThinStatement.execute0(JdbcThinStatement.java:243)
    at org.apache.ignite.internal.jdbc.thin.JdbcThinStatement.execute(JdbcThinStatement.java:560)
    at sqlline.Commands.executeSingleQuery(Commands.java:1054)
    at sqlline.Commands.execute(Commands.java:1003)
    at sqlline.Commands.sql(Commands.java:967)
    at sqlline.SqlLine.dispatch(SqlLine.java:734)
    at sqlline.SqlLine.begin(SqlLine.java:541)
    at sqlline.SqlLine.start(SqlLine.java:267)
    at sqlline.SqlLine.main(SqlLine.java:206)
{code}
The server log does not contain any helpful information either.

When input validation fails, we need to output the following context information:
* Line number of the source file that triggered the error
* A few values preceding the wrong column
* The exact value that failed parsing/conversion
* For complex types (such as date/timestamp), the acceptable input formats
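The context items listed in IGNITE-14325 could be carried in the exception message roughly as follows. This is only a sketch: {{CopyErrorSketch}}, its parameters and the message layout are hypothetical and do not reflect how the COPY command is actually implemented; the only library behavior relied on is that {{java.sql.Timestamp.valueOf}} throws {{IllegalArgumentException}} on malformed input.

```java
import java.sql.Timestamp;
import java.util.Arrays;

/** Hypothetical sketch (not Ignite code): a conversion error that carries import context. */
final class CopyErrorSketch {
    /** Input format accepted by {@link Timestamp#valueOf(String)}. */
    private static final String TS_FORMAT = "yyyy-[m]m-[d]d hh:mm:ss[.f...]";

    /**
     * Converts one field of a parsed CSV row to a timestamp.
     *
     * @param row Parsed values of the current line.
     * @param colIdx Index of the column being converted.
     * @param colName Column name, used only for the error message.
     * @param lineNum Line number in the source file, used only for the error message.
     */
    static Timestamp parseTimestamp(String[] row, int colIdx, String colName, long lineNum) {
        try {
            return Timestamp.valueOf(row[colIdx]);
        }
        catch (IllegalArgumentException e) {
            // Report the line, the failed value, the values preceding it and the accepted format.
            throw new IllegalArgumentException("Value conversion failed [line=" + lineNum +
                ", column=" + colName +
                ", value='" + row[colIdx] + '\'' +
                ", precedingValues=" + Arrays.toString(Arrays.copyOfRange(row, 0, colIdx)) +
                ", acceptedFormat=" + TS_FORMAT + ']', e);
        }
    }
}
```

With this shape, a user can locate the offending line in the file directly from the message instead of bisecting the input.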
[jira] [Created] (IGNITE-14315) Ignite 3: Use maven-flatten plugin for project pom.xml
Alexey Goncharuk created IGNITE-14315:

Summary: Ignite 3: Use maven-flatten plugin for project pom.xml
Key: IGNITE-14315
URL: https://issues.apache.org/jira/browse/IGNITE-14315
Project: Ignite
Issue Type: Improvement
Reporter: Alexey Goncharuk

Without flattening, it would be impossible to use the Maven-published artifacts of Ignite 3.
[jira] [Created] (IGNITE-14272) Merge modules/DEVNOTES.txt and DEVNOTES.txt
Alexey Goncharuk created IGNITE-14272:

Summary: Merge modules/DEVNOTES.txt and DEVNOTES.txt
Key: IGNITE-14272
URL: https://issues.apache.org/jira/browse/IGNITE-14272
Project: Ignite
Issue Type: Improvement
Components: documentation
Reporter: Alexey Goncharuk
Assignee: Alexey Goncharuk
Fix For: 3.0.0-alpha2

The modules/DEVNOTES.txt file was introduced by mistake; its contents should be moved to the root DEVNOTES.txt.

Also, we may add some structure to the modules' README.md files and link them to DEVNOTES.txt.
[jira] [Created] (IGNITE-14112) Revisit GridClosureProcessor#runLocalSafe(Runnable, byte) usages
Alexey Goncharuk created IGNITE-14112:

Summary: Revisit GridClosureProcessor#runLocalSafe(Runnable, byte) usages
Key: IGNITE-14112
URL: https://issues.apache.org/jira/browse/IGNITE-14112
Project: Ignite
Issue Type: Improvement
Reporter: Alexey Goncharuk

If a simple {{Runnable}} is passed to the {{runLocalSafe}} method, not only will Ignite attempt to inject resources into the runnable, but it will also make a call to deployment, which may have various side effects. We need to walk through the code and replace {{Runnable}} with {{GridPlainRunnable}} in all places where injection is not needed/expected.
[jira] [Created] (IGNITE-14111) Clarify how AbstractDataPageIO works
Alexey Goncharuk created IGNITE-14111:

Summary: Clarify how AbstractDataPageIO works
Key: IGNITE-14111
URL: https://issues.apache.org/jira/browse/IGNITE-14111
Project: Ignite
Issue Type: Improvement
Reporter: Alexey Goncharuk

There were several questions on how direct and indirect items work in the DataPageIO; the mechanics should be described in the Javadoc.
[jira] [Created] (IGNITE-14002) Formalize coding guidelines for platforms (C++, Python, Node)
Alexey Goncharuk created IGNITE-14002:

Summary: Formalize coding guidelines for platforms (C++, Python, Node)
Key: IGNITE-14002
URL: https://issues.apache.org/jira/browse/IGNITE-14002
Project: Ignite
Issue Type: Wish
Reporter: Alexey Goncharuk

Currently, we have coding guidelines for Java and .NET in place [1], [2]. It would be nice to have similar documents for other supported platforms and languages to make it easier for newcomers to join the project and to maintain a common code style.
[jira] [Created] (IGNITE-14001) Minor code style fixes for configuration and runner modules
Alexey Goncharuk created IGNITE-14001:

Summary: Minor code style fixes for configuration and runner modules
Key: IGNITE-14001
URL: https://issues.apache.org/jira/browse/IGNITE-14001
Project: Ignite
Issue Type: Bug
Reporter: Alexey Goncharuk

* Some of the configuration files contain TODOs or FIXMEs without tickets - either tickets should be created, or TODOs fixed/removed.
* {{Selector}}, {{DynamicProperty}}, {{ConfigurationStorage}}, {{ConfigurationValidationException}}, {{FieldValidator}} contain extra spacing before the end of class/at the beginning of the class.
* Some of the overridden methods lack the {{@Override}} annotation ({{ConfigurationProperty#value}}, anonymous {{PropertyListener}} in {{Configurator#onAttached}}).

We will try to figure out automated checks for this in the future.
[jira] [Created] (IGNITE-14000) Ignite 3.0: Fix codestyle for cli and cli-common packages
Alexey Goncharuk created IGNITE-14000:

Summary: Ignite 3.0: Fix codestyle for cli and cli-common packages
Key: IGNITE-14000
URL: https://issues.apache.org/jira/browse/IGNITE-14000
Project: Ignite
Issue Type: Bug
Reporter: Alexey Goncharuk
Fix For: 3.0

Currently, the listed modules do not conform to the Ignite Coding Guidelines [1]:
* classes are poorly formatted (e.g. braces for one-line if/for statements, invalid spacing for fields and statement blocks)
* classes lack proper javadocs
* packages are missing package-info.java
* code contains TODOs without tickets (tickets should either be created or the TODOs removed/fixed)
* code contains commented-out code blocks
* some error messages should be fixed ("Fail to..." -> "Failed to...", remove usages of "Please")
[jira] [Created] (IGNITE-13874) Add PMD and idea inspections to Ignite-3 build cycle
Alexey Goncharuk created IGNITE-13874:

Summary: Add PMD and idea inspections to Ignite-3 build cycle
Key: IGNITE-13874
URL: https://issues.apache.org/jira/browse/IGNITE-13874
Project: Ignite
Issue Type: Improvement
Reporter: Alexey Goncharuk

Since we are moving in small incremental code changes, we can impose stricter code rules and, perhaps, automate the code style check completely. Let's start by transferring the PMD and IDEA inspections to the Ignite-3 repository, with some reasonable default profiles.
[jira] [Created] (IGNITE-13800) Provide distributed metastorage interface
Alexey Goncharuk created IGNITE-13800:

Summary: Provide distributed metastorage interface
Key: IGNITE-13800
URL: https://issues.apache.org/jira/browse/IGNITE-13800
Project: Ignite
Issue Type: New Feature
Reporter: Alexey Goncharuk

We need to crystallize the metastorage interface prototype from the IEP to understand how it will be integrated with other system components. Need to cover:
* Asynchrony aspects
* Possible error codes (connection failure -> unknown result vs Raft failure -> known result, etc)
* Complex multi-updates (aka transactions)
* Watchers. Each node can watch all updates and filter locally or adjust the watched ranges dynamically (consistency is important here)

These interfaces are considered "client" interfaces as they will be available on all nodes in the cluster.
[jira] [Created] (IGNITE-13799) Provide a required storage interface for metastorage and partitions for replication protocol
Alexey Goncharuk created IGNITE-13799:

Summary: Provide a required storage interface for metastorage and partitions for replication protocol
Key: IGNITE-13799
URL: https://issues.apache.org/jira/browse/IGNITE-13799
Project: Ignite
Issue Type: New Feature
Reporter: Alexey Goncharuk

We need to identify two storage interfaces that will be interacting with the replication protocol:
* Distributed metastorage persistent state machine
* Partition persistent state machine

The interfaces for the said storages most likely will be quite different, but still will have some common ground. Need to define them so that we can start moving the page memory infrastructure to Ignite-3.
[jira] [Created] (IGNITE-13798) Prototype Raft implementation port to a separate zero-dependency Ignite module
Alexey Goncharuk created IGNITE-13798:

Summary: Prototype Raft implementation port to a separate zero-dependency Ignite module
Key: IGNITE-13798
URL: https://issues.apache.org/jira/browse/IGNITE-13798
Project: Ignite
Issue Type: New Feature
Reporter: Alexey Goncharuk
Assignee: Alexey Goncharuk

We need to check whether it is reasonable and feasible to port the etcd Raft implementation [1] to Java, maintaining the same API interaction model:
* Raft instance is a single-threaded state machine with methods to accept messages, return progress to be processed by a raft client, and a tick callback
* Raft instance does not actively send messages, nor does it actively write to the persistent log or the state machine

The implementation should demonstrate how the module will be used with omitted components: Raft Log, State Machine, Messaging, Timer.

The implementation must cover:
* Ability to provide leader/follower callbacks
* Ability to read linearizable and relaxed commit indexes

The implementation may cover:
* Replication group reconfiguration

The implementation prototype does not cover:
* Multi-raft groups
* Asynchronous state machine mutation
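To make the interaction model from IGNITE-13798 more concrete, here is a toy sketch of a passive, single-threaded state machine that is driven entirely by its caller. It is not a Raft implementation and not the etcd API: the class {{PassiveNodeSketch}}, the method names {{tick}}, {{step}} and {{ready}}, and the string messages are all hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy illustration of a caller-driven state machine; hypothetical, not a Raft implementation. */
final class PassiveNodeSketch {
    /** Number of ticks without a heartbeat after which a vote request is queued. */
    private final int electionTimeoutTicks;

    /** Ticks elapsed since the last heartbeat. */
    private int elapsed;

    /** Messages produced by the state machine and not yet handed to the caller. */
    private final List<String> outbox = new ArrayList<>();

    PassiveNodeSketch(int electionTimeoutTicks) {
        this.electionTimeoutTicks = electionTimeoutTicks;
    }

    /** Timer callback: the caller owns the timer and invokes this periodically. */
    void tick() {
        if (++elapsed >= electionTimeoutTicks) {
            elapsed = 0;

            outbox.add("RequestVote"); // Queued only; the instance never sends anything itself.
        }
    }

    /** Accepts an incoming message; a heartbeat resets the election timer. */
    void step(String msg) {
        if ("Heartbeat".equals(msg))
            elapsed = 0;
    }

    /** Returns the accumulated progress for the caller to process and clears it. */
    List<String> ready() {
        List<String> res = new ArrayList<>(outbox);

        outbox.clear();

        return res;
    }
}
```

The caller's loop would invoke {{tick()}} from its own timer, feed network input through {{step()}}, and then send whatever {{ready()}} returns using its own messaging layer, which is what keeps the timer, messaging, log and state machine outside the module.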
[jira] [Created] (IGNITE-13753) Non-thread-safe collection is used for the list of registered MBeans in JmxMetricExporterSpi
Alexey Goncharuk created IGNITE-13753:

Summary: Non-thread-safe collection is used for the list of registered MBeans in JmxMetricExporterSpi
Key: IGNITE-13753
URL: https://issues.apache.org/jira/browse/IGNITE-13753
Project: Ignite
Issue Type: Bug
Reporter: Alexey Goncharuk

{{MetricManager}} registry creation and remove listeners can be invoked concurrently (the only synchronization is via {{map.computeIfAbsent}}, which provides key-level granularity). As a result, some of the beans are lost and I get an occasional assertion on
{code}
boolean rmv = mBeans.remove(mbeanName);

assert rmv;
{code}
Changing the collection to a synchronized list should suffice.
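The fix suggested in IGNITE-13753 can be sketched as below. This is a standalone stand-in and not the {{JmxMetricExporterSpi}} code: the class {{MBeanListSketch}} is hypothetical and uses plain strings instead of MBean names. Only the use of {{Collections.synchronizedList}} reflects the proposal in the ticket.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for the exporter's bookkeeping of registered beans. */
final class MBeanListSketch {
    /** Synchronized wrapper: individual add/remove calls are safe to invoke concurrently. */
    private final List<String> mBeans = Collections.synchronizedList(new ArrayList<>());

    void register(String name) {
        mBeans.add(name);
    }

    /** @return {@code true} if the name was registered before the call. */
    boolean unregister(String name) {
        return mBeans.remove(name);
    }

    int size() {
        return mBeans.size();
    }

    /** Registers names from several threads and returns how many entries were retained. */
    static int demo(int threads, int perThread) {
        MBeanListSketch reg = new MBeanListSketch();

        List<Thread> workers = new ArrayList<>();

        for (int t = 0; t < threads; t++) {
            final int id = t;

            Thread th = new Thread(() -> {
                for (int i = 0; i < perThread; i++)
                    reg.register("bean-" + id + "-" + i);
            });

            workers.add(th);

            th.start();
        }

        try {
            for (Thread th : workers)
                th.join();
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt();

            throw new IllegalStateException(e);
        }

        return reg.size();
    }
}
```

With a plain {{ArrayList}}, concurrent {{add}} calls may lose entries, which matches the failed {{assert rmv}} described above. Note that iteration over a synchronized list still requires manual synchronization on the list.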
[jira] [Created] (IGNITE-13745) Add release notes for streaming extensions release 1.0.0
Alexey Goncharuk created IGNITE-13745:

Summary: Add release notes for streaming extensions release 1.0.0
Key: IGNITE-13745
URL: https://issues.apache.org/jira/browse/IGNITE-13745
Project: Ignite
Issue Type: Sub-task
Reporter: Alexey Goncharuk
[jira] [Created] (IGNITE-13693) Add flags field with tombstone support, update schema size to 2 bytes
Alexey Goncharuk created IGNITE-13693:

Summary: Add flags field with tombstone support, update schema size to 2 bytes
Key: IGNITE-13693
URL: https://issues.apache.org/jira/browse/IGNITE-13693
Project: Ignite
Issue Type: Improvement
Reporter: Alexey Goncharuk
[jira] [Created] (IGNITE-13670) Omit nullmap where possible
Alexey Goncharuk created IGNITE-13670:

Summary: Omit nullmap where possible
Key: IGNITE-13670
URL: https://issues.apache.org/jira/browse/IGNITE-13670
Project: Ignite
Issue Type: Improvement
Reporter: Alexey Goncharuk

The nullmap is currently always written to the tuple layout. However, it can be fully omitted for schemas where all columns are non-null - this saves both a little space and runtime for offsets folding.
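The decision described in IGNITE-13670 can be sketched as a simple schema check. The {{Column}} descriptor and the one-bit-per-column nullmap size are assumptions made for this example; the actual tuple layout is the one defined in the IEP.

```java
import java.util.List;

/** Hypothetical sketch (not Ignite code): decide whether a schema needs a nullmap at all. */
final class NullMapSketch {
    /** Minimal column descriptor assumed for this example. */
    static final class Column {
        final String name;

        final boolean nullable;

        Column(String name, boolean nullable) {
            this.name = name;
            this.nullable = nullable;
        }
    }

    /** A nullmap is only needed when at least one column may hold null. */
    static boolean needsNullMap(List<Column> cols) {
        for (Column c : cols) {
            if (c.nullable)
                return true;
        }

        return false;
    }

    /** Assumes one bit per column, rounded up to whole bytes; zero when the nullmap is omitted. */
    static int nullMapSize(List<Column> cols) {
        return needsNullMap(cols) ? (cols.size() + 7) / 8 : 0;
    }
}
```

Because nullability is a property of the schema, the check can be evaluated once per schema rather than once per tuple.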
[jira] [Created] (IGNITE-13669) Implement date native types
Alexey Goncharuk created IGNITE-13669:

Summary: Implement date native types
Key: IGNITE-13669
URL: https://issues.apache.org/jira/browse/IGNITE-13669
Project: Ignite
Issue Type: Improvement
Reporter: Alexey Goncharuk

Besides the types themselves, it may be beneficial to provide date/time field extraction methods so that they can be read without object creation. The layout is described in the IEP.
[jira] [Created] (IGNITE-13668) Implement Number(n) and Decimal native types
Alexey Goncharuk created IGNITE-13668:

Summary: Implement Number(n) and Decimal native types
Key: IGNITE-13668
URL: https://issues.apache.org/jira/browse/IGNITE-13668
Project: Ignite
Issue Type: Improvement
Reporter: Alexey Goncharuk

Number(n) is an {{n}}-byte two's-complement signed integer value encoded in the varlong style (so that Number(4) can be mapped to integer and Number(8) can be mapped to long during (de)serialization). Larger numbers can be represented as {{BigInteger}}.

Number(n) is a varlen type and therefore takes two additional bytes in the varlen table, so types smaller than Number(2) are better represented by the {{byte}} and {{short}} types, as their fixlen encoding takes exactly 1 and 2 bytes respectively.

Decimal is a direct mapping to a {{BigDecimal}} value.
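The mapping described in IGNITE-13668 can be illustrated with the JDK's own minimal two's-complement encoding. This is a sketch under assumptions: the class {{NumberSketch}} is hypothetical, and the big-endian byte order produced by {{BigInteger.toByteArray()}} is not necessarily the on-disk order Ignite will use.

```java
import java.math.BigInteger;

/** Hypothetical sketch (not Ignite code): variable-length two's-complement integers. */
final class NumberSketch {
    /** Encodes the value into the minimal number of two's-complement bytes (big-endian). */
    static byte[] encode(BigInteger v) {
        return v.toByteArray();
    }

    /**
     * Decodes a value, picking the narrowest Java type suggested by the encoded length:
     * up to 4 bytes maps to Integer, up to 8 bytes to Long, anything larger to BigInteger.
     */
    static Object decode(byte[] bytes) {
        BigInteger v = new BigInteger(bytes);

        if (bytes.length <= 4)
            return v.intValueExact();

        if (bytes.length <= 8)
            return v.longValueExact();

        return v;
    }
}
```

For example, -1 encodes to a single byte, while 128 needs two bytes because the sign bit must remain clear for a positive value.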
[jira] [Created] (IGNITE-13667) Add schema columns relocation table to map from user order to system order
Alexey Goncharuk created IGNITE-13667:

Summary: Add schema columns relocation table to map from user order to system order
Key: IGNITE-13667
URL: https://issues.apache.org/jira/browse/IGNITE-13667
Project: Ignite
Issue Type: Improvement
Reporter: Alexey Goncharuk

When a schema is defined, the key chunk columns and value chunk columns are sorted so that fixlen columns go first and varlen columns go second, so the sorted column order differs from the order of the user-defined columns. We need to add a simple relocation table which is a permutation of indices {{[0..n)}}, so that the internal column order for user index {{i}} is {{relocationTbl[i]}}.

NB: the tuple assembler will still need to access the internal sorted order for proper tuple assembly.
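A relocation table of the kind described in IGNITE-13667 could be built as follows. This is a sketch under assumptions: the class {{RelocationSketch}} is hypothetical, and the only sort criterion used is fixlen-before-varlen with the user order preserved inside each group, whereas the real schema sort may apply additional keys.

```java
import java.util.Arrays;

/** Hypothetical sketch (not Ignite code): user column order to internal column order. */
final class RelocationSketch {
    /**
     * @param varlen {@code varlen[i]} is true when the i-th user-defined column is variable-length.
     * @return Permutation where element {@code i} is the internal index of user column {@code i}.
     */
    static int[] relocationTable(boolean[] varlen) {
        Integer[] sorted = new Integer[varlen.length];

        for (int i = 0; i < sorted.length; i++)
            sorted[i] = i;

        // Stable sort: fixlen columns (false) first, varlen columns (true) second.
        Arrays.sort(sorted, (a, b) -> Boolean.compare(varlen[a], varlen[b]));

        int[] relocationTbl = new int[varlen.length];

        // sorted[internalIdx] holds the user index of the column placed at internalIdx.
        for (int internalIdx = 0; internalIdx < sorted.length; internalIdx++)
            relocationTbl[sorted[internalIdx]] = internalIdx;

        return relocationTbl;
    }
}
```

For user columns declared as varlen, fixlen, varlen, fixlen, the internal order is user columns 1, 3, 0, 2, so the table is {{[2, 0, 3, 1]}}.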
[jira] [Created] (IGNITE-13634) Ignite-extensions: update dependencies to use provided scope
Alexey Goncharuk created IGNITE-13634:

Summary: Ignite-extensions: update dependencies to use provided scope
Key: IGNITE-13634
URL: https://issues.apache.org/jira/browse/IGNITE-13634
Project: Ignite
Issue Type: Bug
Reporter: Alexey Goncharuk

Ignite extensions should be version-agnostic, therefore extension dependencies should be declared with {{provided}} scope. This is already done correctly for spring-boot-autoconfigure and spring-boot-thin-client-autoconfigure, so the other dependencies need to be updated as well.
[jira] [Created] (IGNITE-13616) IEP-54 Live schema for tables
Alexey Goncharuk created IGNITE-13616:

Summary: IEP-54 Live schema for tables
Key: IGNITE-13616
URL: https://issues.apache.org/jira/browse/IGNITE-13616
Project: Ignite
Issue Type: Bug
Reporter: Alexey Goncharuk
Fix For: 3.0

Umbrella ticket for [IEP-54|https://cwiki.apache.org/confluence/display/IGNITE/IEP-54%3A+Schema-first+Approach]
[jira] [Created] (IGNITE-13570) Migrate OSGI module to ignite-extensions
Alexey Goncharuk created IGNITE-13570:

Summary: Migrate OSGI module to ignite-extensions
Key: IGNITE-13570
URL: https://issues.apache.org/jira/browse/IGNITE-13570
Project: Ignite
Issue Type: Sub-task
Reporter: Alexey Goncharuk

Migrate OSGI module to ignite-extensions https://github.com/apache/ignite-extensions

Details: https://cwiki.apache.org/confluence/display/IGNITE/IEP-36%3A+Modularization#IEP-36:Modularization-IndependentIntegrations
[jira] [Created] (IGNITE-13241) Get rid of Externalizable implementation for GridCacheAdapter
Alexey Goncharuk created IGNITE-13241:

Summary: Get rid of Externalizable implementation for GridCacheAdapter
Key: IGNITE-13241
URL: https://issues.apache.org/jira/browse/IGNITE-13241
Project: Ignite
Issue Type: Improvement
Reporter: Alexey Goncharuk

The cache implementation must not be serialized. For convenience, only user cache proxies support serialization.
[jira] [Created] (IGNITE-13220) Cleanup IgniteCacheOffheapManager interface
Alexey Goncharuk created IGNITE-13220:

Summary: Cleanup IgniteCacheOffheapManager interface
Key: IGNITE-13220
URL: https://issues.apache.org/jira/browse/IGNITE-13220
Project: Ignite
Issue Type: Improvement
Components: persistence
Reporter: Alexey Goncharuk

Currently, the {{IgniteCacheOffheapManager}} interface is too verbose and there are a few methods that leak implementation specifics:
* Easy
** {{void onPartitionCounterUpdated(int part, long cntr)}} is not used
** {{long lastUpdatedPartitionCounter(int part)}} is not used
** {{boolean containsKey(GridCacheMapEntry entry)}} is not used
** {{int onUndeploy(ClassLoader ldr)}} is a no-op and should be removed as caches do not participate in peer class loading
** {{int offheapAllocatedSize}} always returns 0 and looks like it should be removed as there are corresponding data region metrics
* Moderate
** A number of methods in {{CacheDataStore}} are semantically equivalent to the ones in {{IgniteCacheOffheapManager}} itself: {{update}}, {{invoke}}, {{remove}}, the {{mvcc*}} methods. It looks like the duplicates from {{IgniteCacheOffheapManager}} may be removed altogether
** There are a number of {{iterator}} methods in the interface. Most likely, we can remove the ones that convert cache data rows to {{CacheEntry}} objects and the ones that join multiple partition iterators together.
** {{void clearCache(GridCacheContext cctx, boolean readers)}} does not belong to the interface as it can be implemented outside of the manager
* Complex
** {{globalRemoveId}} leaks the B+Tree implementation specifics and should be removed from the interface
** {{rootPageForIndex}} and the corresponding destroy method leak the page memory abstraction. Index management should be abstracted for any type of storage.
** There is an obvious duplication of MVCC and non-MVCC methods in the interface. For each cache group, and therefore for each instance of {{CacheDataStore}}, only one of the sets will be used. We should have either a single interface for both, or different interfaces, but not two sets of methods in one interface.
[jira] [Created] (IGNITE-13195) Allow skipping autotools invocation when building Ignite release
Alexey Goncharuk created IGNITE-13195:

Summary: Allow skipping autotools invocation when building Ignite release
Key: IGNITE-13195
URL: https://issues.apache.org/jira/browse/IGNITE-13195
Project: Ignite
Issue Type: Improvement
Reporter: Alexey Goncharuk

I do not have an up-to-date set of autotools installed on my local machine, and the Apache Ignite release build fails locally with the following error:
{code}
main::scan_file() called too early to check prototype at /usr/local/bin/aclocal line 617.
configure.ac:36: warning: macro `AM_PROG_AR' not found in library
configure.ac:21: error: Autoconf version 2.69 or higher is required
configure.ac:21: the top level
autom4te: /usr/bin/m4 failed with exit status: 63
aclocal: autom4te failed with exit status: 63
{code}
I do not need to run these commands locally because I only need a quick assembly (Java only, without even Javadocs) to verify the release structure and the integrity of the command-line utilities.

It would be great to move these commands to a separate profile (enabled by default) so users can skip them when building the release package, something like {{mvn initialize -Prelease -P!autotools}}.
[jira] [Created] (IGNITE-13101) Metastore may leave uncompleted write futures during node stop
Alexey Goncharuk created IGNITE-13101:

Summary: Metastore may leave uncompleted write futures during node stop
Key: IGNITE-13101
URL: https://issues.apache.org/jira/browse/IGNITE-13101
Project: Ignite
Issue Type: Bug
Reporter: Alexey Goncharuk

I've got the following thread-dump (only relevant parts are retained) during one of the TeamCity runs:
{code}
"sys-#103862%baseline.IgniteStableBaselineBinObjFieldsQuerySelfTest0%" #107048 prio=5 os_prio=0 tid=0x7fa2d8009800 nid=0x480d waiting on condition [0x7fa1d1cdc000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141)
    at org.apache.ignite.internal.processors.metric.GridMetricManager.remove(GridMetricManager.java:411)
    at org.apache.ignite.internal.processors.cache.CacheGroupMetricsImpl.remove(CacheGroupMetricsImpl.java:497)
    at org.apache.ignite.internal.processors.cache.GridCacheProcessor.cleanup(GridCacheProcessor.java:512)
    at org.apache.ignite.internal.processors.cache.GridCacheProcessor.stopCacheGroup(GridCacheProcessor.java:2901)
    at org.apache.ignite.internal.processors.cache.GridCacheProcessor.stopCacheGroup(GridCacheProcessor.java:2889)
    at org.apache.ignite.internal.processors.cache.GridCacheProcessor.processCacheStopRequestOnExchangeDone(GridCacheProcessor.java:2781)
    at org.apache.ignite.internal.processors.cache.GridCacheProcessor.onExchangeDone(GridCacheProcessor.java:2878)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onDone(GridDhtPartitionsExchangeFuture.java:2431)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.finishExchangeOnCoordinator(GridDhtPartitionsExchangeFuture.java:3832)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onAllReceived(GridDhtPartitionsExchangeFuture.java:3608)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.processSingleMessage(GridDhtPartitionsExchangeFuture.java:3207)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.access$200(GridDhtPartitionsExchangeFuture.java:154)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture$2.apply(GridDhtPartitionsExchangeFuture.java:2994)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture$2.apply(GridDhtPartitionsExchangeFuture.java:2982)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:399)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.listen(GridFutureAdapter.java:354)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onReceiveSingleMessage(GridDhtPartitionsExchangeFuture.java:2982)
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager.processSinglePartitionUpdate(GridCachePartitionExchangeManager.java:1989)
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager.preprocessSingleMessage(GridCachePartitionExchangeManager.java:524)
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager.access$1100(GridCachePartitionExchangeManager.java:182)
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$2.onMessage(GridCachePartitionExchangeManager.java:407)
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$2.onMessage(GridCachePartitionExchangeManager.java:389)
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$MessageHandler.apply(GridCachePartitionExchangeManager.java:3715)
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$MessageHandler.apply(GridCachePartitionExchangeManager.java:3694)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1142)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:591)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:392)
    at
[jira] [Created] (IGNITE-13100) ClassCastException in cache group metrics on client nodes
Alexey Goncharuk created IGNITE-13100:

Summary: ClassCastException in cache group metrics on client nodes
Key: IGNITE-13100
URL: https://issues.apache.org/jira/browse/IGNITE-13100
Project: Ignite
Issue Type: Bug
Reporter: Alexey Goncharuk

The following exception can be observed when reading cache group metrics on client nodes with persistence-enabled config:
{code}
java.lang.ClassCastException: org.apache.ignite.internal.processors.cache.persistence.IgniteCacheDatabaseSharedManager cannot be cast to org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager
    at org.apache.ignite.internal.processors.cache.CacheGroupMetricsImpl.database(CacheGroupMetricsImpl.java:506)
    at org.apache.ignite.internal.processors.cache.CacheGroupMetricsImpl.lambda$new$0(CacheGroupMetricsImpl.java:103)
    at org.apache.ignite.internal.util.lang.GridFunc.lambda$nonThrowableSupplier$3(GridFunc.java:3341)
    at org.apache.ignite.internal.processors.metric.impl.LongGauge.value(LongGauge.java:45)
    at org.apache.ignite.spi.metric.LongMetric.getAsString(LongMetric.java:29)
{code}
The reason is an incomplete check for persistence enabled in {{CacheGroupMetricsImpl}}: we should also check for client nodes.
[jira] [Created] (IGNITE-12990) Remove static thread group from IgniteSpiThread
Alexey Goncharuk created IGNITE-12990:

Summary: Remove static thread group from IgniteSpiThread
Key: IGNITE-12990
URL: https://issues.apache.org/jira/browse/IGNITE-12990
Project: Ignite
Issue Type: Improvement
Reporter: Alexey Goncharuk

This is a follow-up ticket for IGNITE-12554. The static thread group has not been removed from {{IgniteSpiThread}}, which still prevents node start after the thread group is destroyed.
[jira] [Created] (IGNITE-12802) Move checkpoint state fields to CheckpointProgress
Alexey Goncharuk created IGNITE-12802:

Summary: Move checkpoint state fields to CheckpointProgress
Key: IGNITE-12802
URL: https://issues.apache.org/jira/browse/IGNITE-12802
Project: Ignite
Issue Type: Improvement
Reporter: Alexey Goncharuk

This is a review follow-up for IGNITE-7792. I've noticed that quite a few fields in {{GridCacheDatabaseSharedManager}} are related to the state of the current checkpoint:
{code}
writtenPagesCntr
syncedPagesCntr
evictedPagesCntr
currCheckpointPagesCnt
{code}
After a checkpoint is completed, these fields are reset. On the other hand, we have a separate class to track the state of the current checkpoint: {{CheckpointProgressImpl}}. I believe it makes sense to move these fields to that class. Perhaps it also makes sense to make it a top-level class.
[jira] [Created] (IGNITE-12739) Optimistic serializable transactions may fail infinitely when read-through is enabled
Alexey Goncharuk created IGNITE-12739: - Summary: Optimistic serializable transactions may fail infinitely when read-through is enabled Key: IGNITE-12739 URL: https://issues.apache.org/jira/browse/IGNITE-12739 Project: Ignite Issue Type: Bug Reporter: Alexey Goncharuk In the current design it is possible that the same key-value pair will be stored with different versions on primary and backup nodes. For example, a read-through is invoked separately on primary and backup, and values are stored with a node-local version. With this precondition, if an optimistic serializable transaction is started from a backup node, the serializable check version is read from the backup but validated on the primary node, which will fail the transaction with an optimistic read/write conflict exception until the versions are overwritten to the same value (for example, via a pessimistic transaction). While we need to additionally investigate whether we want to change the read-through logic to ensure the same value and version on all nodes, this particular scenario should be fixed by always enforcing reading from a primary node inside an optimistic serializable transaction. The reproducer is attached. A known workaround is to disable read load balancing by setting the "-DIGNITE_READ_LOAD_BALANCING=false" system property. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IGNITE-12681) IgniteShutdownOnSupplyMessageFailureTest
Alexey Goncharuk created IGNITE-12681: - Summary: IgniteShutdownOnSupplyMessageFailureTest Key: IGNITE-12681 URL: https://issues.apache.org/jira/browse/IGNITE-12681 Project: Ignite Issue Type: Test Reporter: Alexey Goncharuk The test checks that a node will be shut down by a failure handler by listening for NODE_LEFT event. However, if the node shutdown happens before a new node joins the cluster, the joining node will form a cluster by itself with topology version = 1 and no event will be fired. The test should be changed to specifically listen for the failure handler invocation. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IGNITE-12575) Document @IgniteExperimental annotation
Alexey Goncharuk created IGNITE-12575: - Summary: Document @IgniteExperimental annotation Key: IGNITE-12575 URL: https://issues.apache.org/jira/browse/IGNITE-12575 Project: Ignite Issue Type: Task Components: documentation Affects Versions: 2.8 Reporter: Alexey Goncharuk Fix For: 2.8 We introduced the annotation to mark APIs which are exposed to users to try out new features, but the APIs are likely to evolve in the future. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IGNITE-12559) Introduce @IgniteExperimental API annotation
Alexey Goncharuk created IGNITE-12559: - Summary: Introduce @IgniteExperimental API annotation Key: IGNITE-12559 URL: https://issues.apache.org/jira/browse/IGNITE-12559 Project: Ignite Issue Type: Improvement Affects Versions: 2.8 Reporter: Alexey Goncharuk Assignee: Alexey Goncharuk Fix For: 2.8 This is a follow-up for the discussion on the dev-list: http://apache-ignite-developers.2346864.n4.nabble.com/Internal-classes-are-exposed-in-public-API-td45146.html The annotation will mark new APIs which are not yet finalized, but the underlying functionality is ready to be tried by users. The annotation will provide a good way to collect user feedback and correct wrong design choices without a need to preserve binary and source compatibility. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IGNITE-12550) Add page read latency histogram per data region
Alexey Goncharuk created IGNITE-12550: - Summary: Add page read latency histogram per data region Key: IGNITE-12550 URL: https://issues.apache.org/jira/browse/IGNITE-12550 Project: Ignite Issue Type: Improvement Components: persistence Reporter: Alexey Goncharuk Assignee: Alexey Goncharuk Fix For: 2.9 During an incident I experienced a large checkpoint mark duration. It was impossible to determine whether this was caused by a stalled disk because of a large number of long page reads or by something else. Having a metric showing the page read latency histogram would help in such cases. -- This message was sent by Atlassian Jira (v8.3.4#803005)
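A minimal sketch of what such a per-region histogram could look like, assuming fixed bucket bounds in nanoseconds and one counter per bucket. The class name, method names and bounds are hypothetical and are not taken from the Ignite metrics API.

```java
import java.util.concurrent.atomic.AtomicLongArray;

/** Hypothetical fixed-bucket latency histogram; not an Ignite class. */
class PageReadLatencyHistogram {
    /** Upper bounds of the buckets in nanoseconds, in ascending order. */
    private final long[] bounds;

    /** One counter per bound plus a trailing overflow bucket. */
    private final AtomicLongArray counts;

    PageReadLatencyHistogram(long... bounds) {
        this.bounds = bounds.clone();
        this.counts = new AtomicLongArray(bounds.length + 1);
    }

    /** Records one page read that took the given number of nanoseconds. */
    void record(long nanos) {
        int idx = 0;

        // Find the first bucket whose upper bound is not exceeded.
        while (idx < bounds.length && nanos > bounds[idx])
            idx++;

        counts.incrementAndGet(idx);
    }

    /** Returns a copy of the bucket counters; the last element is the overflow bucket. */
    long[] snapshot() {
        long[] res = new long[counts.length()];

        for (int i = 0; i < res.length; i++)
            res[i] = counts.get(i);

        return res;
    }
}
```

With bounds of 1 ms, 10 ms and 100 ms, a stalled disk would show up as counts accumulating in the upper buckets while the checkpoint mark duration grows.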
[jira] [Created] (IGNITE-12490) Service proxy throws "Service not found" exception right after deploy
Alexey Goncharuk created IGNITE-12490: - Summary: Service proxy throws "Service not found" exception right after deploy Key: IGNITE-12490 URL: https://issues.apache.org/jira/browse/IGNITE-12490 Project: Ignite Issue Type: Improvement Components: managed services Affects Versions: 2.8 Reporter: Alexey Goncharuk Attachments: ServiceInvokeTest.java In the following scenario: * Start nodes A, B * Deploy a service on A * Create a service proxy on B * Invoke the proxy The proxy invocation throws a service not found exception. As per discussion [on the dev list|http://apache-ignite-developers.2346864.n4.nabble.com/Discovery-based-services-deployment-guarantees-question-td44866.html] this case should be handled by an automatic retry, however, it's not. The reproducer is attached. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IGNITE-12438) Extend communication protocol to establish client-server connection
Alexey Goncharuk created IGNITE-12438: - Summary: Extend communication protocol to establish client-server connection Key: IGNITE-12438 URL: https://issues.apache.org/jira/browse/IGNITE-12438 Project: Ignite Issue Type: Improvement Reporter: Alexey Goncharuk Recently there have been quite a lot of questions related to thick client connectivity issues when the clients are deployed in a k8s pod [1]. The general issue here is that clients report network addresses which are not reachable from server nodes. At the same time, the clients can connect to server nodes. An idea of how to fix this is as follows: * Make sure that the thick client's discovery SPI always maintains a connection to a server node (this should be already implemented) * (Optionally) detect when a client has only one-way connectivity with the server nodes. This part should be investigated. We need this so that server nodes do not attempt to connect to such a client and can send a communication request to the client node faster * When a server attempts to establish a connection with a client, check if the client is unreachable or the previous connection attempt failed. If so, send a discovery message to the client to force a client-server connection. In this case, the server will be able to send the original message via the newly established connection. [1] https://stackoverflow.com/questions/59192075/ignite-communicationspi-questions-in-paas-environment/59232504 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IGNITE-12430) Move PagePool to a separate class
Alexey Goncharuk created IGNITE-12430: - Summary: Move PagePool to a separate class Key: IGNITE-12430 URL: https://issues.apache.org/jira/browse/IGNITE-12430 Project: Ignite Issue Type: Improvement Reporter: Alexey Goncharuk Assignee: Alexey Goncharuk This is a refactoring required for IGNITE-12412 in order to create a separate test. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IGNITE-12412) Incomplete check for ABA problem in PageMemoryImpl#PagePool
Alexey Goncharuk created IGNITE-12412: - Summary: Incomplete check for ABA problem in PageMemoryImpl#PagePool Key: IGNITE-12412 URL: https://issues.apache.org/jira/browse/IGNITE-12412 Project: Ignite Issue Type: Improvement Reporter: Alexey Goncharuk Assignee: Alexey Goncharuk In current implementation, {{PagePool#releasePage}} clears the counter part of the returned page ID, which effectively disables the ABA check intended in the class. This issue can be rarely reproduced on zOS during checkpoints (when pages are being taken and returned to the checkpoint pages pool). I managed to write a unit-test to reproduce this issue on x86. -- This message was sent by Atlassian Jira (v8.3.4#803005)
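To illustrate why the counter part matters, here is a generic tagged free-list sketch. It is not Ignite's {{PagePool}}: the class, the 32/32-bit packing and all method names are assumptions made for the example. The head word carries a tag that is advanced on every push and pop, so a compare-and-set issued against a stale head fails even if the same page index is back on top.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical lock-free page free list with an ABA tag; not Ignite's PagePool. */
class TaggedPagePool {
    static final int NIL = -1;

    /** High 32 bits: tag (counter). Low 32 bits: index of the top free page or NIL. */
    private final AtomicLong head;

    /** next[i] is the free page below page i in the stack. */
    private final AtomicIntegerArray next;

    TaggedPagePool(int pages) {
        next = new AtomicIntegerArray(pages);

        for (int i = 0; i < pages; i++)
            next.set(i, i + 1 < pages ? i + 1 : NIL);

        head = new AtomicLong(pack(0, pages > 0 ? 0 : NIL));
    }

    static long pack(int tag, int idx) {
        return ((long)tag << 32) | (idx & 0xFFFFFFFFL);
    }

    static int tag(long h) {
        return (int)(h >>> 32);
    }

    static int index(long h) {
        return (int)h;
    }

    /** Takes a free page, or returns NIL if the pool is empty. */
    int borrowPage() {
        while (true) {
            long h = head.get();
            int idx = index(h);

            if (idx == NIL)
                return NIL;

            if (head.compareAndSet(h, pack(tag(h) + 1, next.get(idx))))
                return idx;
        }
    }

    /** Returns a page to the pool. */
    void releasePage(int idx) {
        while (true) {
            long h = head.get();

            next.set(idx, index(h));

            // Keep and advance the tag. Resetting it to zero here would let a
            // stale compareAndSet in borrowPage succeed, which is the ABA problem.
            if (head.compareAndSet(h, pack(tag(h) + 1, idx)))
                return;
        }
    }

    /** Raw head word, exposed only to make the tag observable. */
    long rawHead() {
        return head.get();
    }
}
```

After a borrow, release and borrow cycle the head may point at the same page index as before, but the raw head word differs because the tag has moved on.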
[jira] [Created] (IGNITE-12263) Introduce native persistence compaction operation
Alexey Goncharuk created IGNITE-12263: - Summary: Introduce native persistence compaction operation Key: IGNITE-12263 URL: https://issues.apache.org/jira/browse/IGNITE-12263 Project: Ignite Issue Type: Improvement Reporter: Alexey Goncharuk Currently, Ignite native persistence does not shrink storage files after key-value pairs are removed. The causes of this behavior are: * The absence of a mechanism that allows Ignite to track the highest non-empty page position in a partition file * The absence of a mechanism which allows Ignite to select a page closest to the file beginning for write * The absence of a mechanism which allows Ignite to move a key-value pair from page to page during defragmentation As an initial change I suggest introducing a new node startup mode, which will run a defragmentation procedure allowing the node to shrink storage files. The procedure will not mutate the logical state of a partition, allowing further historical rebalance to quickly catch up the node. Since the procedure will run during the node startup (during the final stages of recovery), there will be no concurrent load, thus the entries can be freely moved from page to page with no tricky synchronization. If the procedure is applied during a whole cluster restart, then all nodes will be defragmented simultaneously, allowing for a quicker parallel defragmentation at a cost of downtime. The procedure should accept an optional list of cache groups to defragment to allow arbitrary cache group selection for defragmentation. An idea of the actions taken during the run for each partition selected for defragmentation: * Partition pages are preloaded to memory if possible to avoid excessive page replacement. During the scan, a high-water mark (HWM) of the written data is detected (empty pages are skipped) * Page references in a free list are sorted in a way that allows picking the pages closest to the file start * The partition is scanned in reverse order, key-value pairs are moved closer to the file start, and the HWM is updated accordingly. This step is particularly open for various optimizations because different strategies will work well for different fragmentation patterns. * After the scan iteration is completed, the file size can be updated according to the HWM As a further improvement, this partition defragmentation procedure can be later run in online mode, after proper cache update protocol changes are designed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
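The reverse scan described in the ticket can be illustrated with a deliberately simplified model, assuming a partition is a flat array of slots where {{null}} means empty. Real pages hold many entries and the free list ordering is more involved, so this is only a sketch of the two-pointer idea; the class and method names are hypothetical.

```java
/** Hypothetical model of partition compaction; a slot stands in for a page. */
class DefragSketch {
    /**
     * Moves entries from the tail into the free slots closest to the start.
     * Returns the high-water mark: the index one past the last non-empty slot.
     * The set of stored entries is unchanged; only their positions change.
     */
    static int compact(String[] slots) {
        int free = 0;
        int last = slots.length - 1;

        while (true) {
            // Lowest free slot, i.e. the one closest to the file start.
            while (free < slots.length && slots[free] != null)
                free++;

            // Reverse scan for the last non-empty slot.
            while (last >= 0 && slots[last] == null)
                last--;

            if (free >= last)
                break;

            slots[free] = slots[last];
            slots[last] = null;
        }

        return last + 1;
    }
}
```

Once the loop finishes, everything at or beyond the returned index is empty, so the file could be truncated to that position.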
[jira] [Created] (IGNITE-12150) Ignite javadoc contains bogus links, copyright is outdated
Alexey Goncharuk created IGNITE-12150: - Summary: Ignite javadoc contains bogus links, copyright is outdated Key: IGNITE-12150 URL: https://issues.apache.org/jira/browse/IGNITE-12150 Project: Ignite Issue Type: Bug Reporter: Alexey Goncharuk Assignee: Alexey Goncharuk Ignite Javadoc contains links to {{http://agorbatchev.typepad.com}} which may be broken and unnecessary. Also, the copyright year is hardcoded and outdated. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (IGNITE-12113) control.sh terminates silently when JAVA_HOME is not set
Alexey Goncharuk created IGNITE-12113: - Summary: control.sh terminates silently when JAVA_HOME is not set Key: IGNITE-12113 URL: https://issues.apache.org/jira/browse/IGNITE-12113 Project: Ignite Issue Type: Bug Reporter: Alexey Goncharuk Running control.sh from ignite-2.7.6 release candidate on MacOS with empty JAVA_HOME produces no output - the script terminates without any action. The reason is the following line in {{bin/control.sh}}: {code} javaMajorVersion "${JAVA_HOME}/bin/java" {code} Since {{JAVA_HOME}} is empty, the argument passed to the function is invalid and the function terminates the script. I suggest replacing the {{${JAVA_HOME}/bin/java}} with just {{$JAVA}} because it is already determined earlier in the scope. The suggested fix works in my environment for all options of {{JAVA_HOME}} -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (IGNITE-12019) [TC Bot] PR screen resets the base branch on page refresh
Alexey Goncharuk created IGNITE-12019: - Summary: [TC Bot] PR screen resets the base branch on page refresh Key: IGNITE-12019 URL: https://issues.apache.org/jira/browse/IGNITE-12019 Project: Ignite Issue Type: Bug Reporter: Alexey Goncharuk Even though 'baseBranchForTc' is already supported by the page, it is not picked up because the initial value is '' and the URL is not updated when the select is changed. I suggest updating the URL on select change using the history object. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (IGNITE-11930) TcpDiscoverySpi does not close bound server socket if discovery thread did not start
Alexey Goncharuk created IGNITE-11930: - Summary: TcpDiscoverySpi does not close bound server socket if discovery thread did not start Key: IGNITE-11930 URL: https://issues.apache.org/jira/browse/IGNITE-11930 Project: Ignite Issue Type: Bug Affects Versions: 2.7 Reporter: Alexey Goncharuk Assignee: Alexey Goncharuk Fix For: 2.8 See {{ServerImpl.spiStop0(boolean)}}. If the worker did not start, {{U.cancel()}} has no effect because the runner field is not initialized, and the server socket is closed in the {{onInterrupted()}} method, which is called from the {{interrupt()}} method on the worker thread. This results in the server socket not being closed and may cause tests to hang, for example, in .NET tests. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11750) Implement locked pages info for long-running B+Tree operations
Alexey Goncharuk created IGNITE-11750: - Summary: Implement locked pages info for long-running B+Tree operations Key: IGNITE-11750 URL: https://issues.apache.org/jira/browse/IGNITE-11750 Project: Ignite Issue Type: Improvement Reporter: Alexey Goncharuk I've stumbled upon an incident where a batch of Ignite threads were hanging on BPlusTree operations trying to acquire read or write lock on pages. From the thread dump it is impossible to check if there is an issue with {{OffheapReadWriteLock}} or there is a subtle deadlock in the tree. I suggest we implement a timeout for page lock acquire and tracking of locked pages. This should be relatively easy to implement in {{PageHandler}} (the only thing to consider is performance degradation). If a timeout occurs, we should print all the locks currently owned by a thread. This way we should be able to determine if there is a deadlock in the {{BPlusTree}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
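A rough sketch of the suggested timeout and lock tracking, using {{java.util.concurrent.locks.ReentrantReadWriteLock}} as a stand-in for {{OffheapReadWriteLock}}. The class, its methods and the per-thread tracking structure are hypothetical and do not reflect {{PageHandler}}.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical page lock wrapper with a timeout and per-thread tracking. */
class TrackedPageLocks {
    /** Page IDs currently locked by the calling thread, most recent first. */
    private static final ThreadLocal<Deque<Long>> HELD = ThreadLocal.withInitial(ArrayDeque::new);

    private final long timeoutMs;

    TrackedPageLocks(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    boolean readLock(long pageId, ReentrantReadWriteLock lock) {
        return acquire(pageId, lock.readLock());
    }

    boolean writeLock(long pageId, ReentrantReadWriteLock lock) {
        return acquire(pageId, lock.writeLock());
    }

    void readUnlock(long pageId, ReentrantReadWriteLock lock) {
        release(pageId, lock.readLock());
    }

    void writeUnlock(long pageId, ReentrantReadWriteLock lock) {
        release(pageId, lock.writeLock());
    }

    /** Snapshot of the pages locked by the calling thread. */
    List<Long> heldPages() {
        return new ArrayList<>(HELD.get());
    }

    private boolean acquire(long pageId, Lock lock) {
        try {
            if (lock.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
                HELD.get().push(pageId);

                return true;
            }
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }

        // On timeout, report everything this thread already holds.
        System.err.println("Failed to lock page 0x" + Long.toHexString(pageId) +
            ", pages held by " + Thread.currentThread().getName() + ": " + heldPages());

        return false;
    }

    private void release(long pageId, Lock lock) {
        HELD.get().removeFirstOccurrence(pageId);
        lock.unlock();
    }
}
```

The extra cost per acquire is one thread-local lookup and a deque push, which is the kind of overhead the ticket asks to measure before enabling this by default.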
[jira] [Created] (IGNITE-11749) Implement automatic pages history dump on CorruptedTreeException
Alexey Goncharuk created IGNITE-11749: - Summary: Implement automatic pages history dump on CorruptedTreeException Key: IGNITE-11749 URL: https://issues.apache.org/jira/browse/IGNITE-11749 Project: Ignite Issue Type: Improvement Reporter: Alexey Goncharuk Currently, the only way to debug possible bugs in checkpointer/recovery mechanics is to manually parse WAL files after the corruption happened. This is not practical for several reasons. First, it requires manual actions which depend on the content of the exception. Second, it is not always possible to obtain WAL files (they may contain sensitive data). We need to add a mechanism to the exception handler which will dump all information required for primary analysis of the corruption. For example, if an exception happened when materializing a link {{0xabcd}} written on an index page {{0xdcba}}, we need to dump the change history of both pages and the checkpoint records in the analysis interval. Possibly, we should also include the FreeList pages in which the aforementioned pages were included. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11732) Multi-merged partitions exchange future may hang if node left event is received last
Alexey Goncharuk created IGNITE-11732: - Summary: Multi-merged partitions exchange future may hang if node left event is received last Key: IGNITE-11732 URL: https://issues.apache.org/jira/browse/IGNITE-11732 Project: Ignite Issue Type: Bug Reporter: Alexey Goncharuk -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11725) Document IGNITE_DISCOVERY_DISABLE_CACHE_METRICS_UPDATE system property
Alexey Goncharuk created IGNITE-11725: - Summary: Document IGNITE_DISCOVERY_DISABLE_CACHE_METRICS_UPDATE system property Key: IGNITE-11725 URL: https://issues.apache.org/jira/browse/IGNITE-11725 Project: Ignite Issue Type: Task Reporter: Alexey Goncharuk In IGNITE-10172 we added an option to disable cache metrics sending via discovery messages. This property should be properly documented. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11707) Tcp Discovery should drop pending metrics update message when new message is received
Alexey Goncharuk created IGNITE-11707: - Summary: Tcp Discovery should drop pending metrics update message when new message is received Key: IGNITE-11707 URL: https://issues.apache.org/jira/browse/IGNITE-11707 Project: Ignite Issue Type: Improvement Reporter: Alexey Goncharuk -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11704) Write tombstones during rebalance to get rid of deferred delete buffer
Alexey Goncharuk created IGNITE-11704: - Summary: Write tombstones during rebalance to get rid of deferred delete buffer Key: IGNITE-11704 URL: https://issues.apache.org/jira/browse/IGNITE-11704 Project: Ignite Issue Type: Improvement Reporter: Alexey Goncharuk Currently Ignite relies on the deferred delete buffer in order to handle write-remove conflicts during rebalance. Given the limited size of the buffer, this approach is fundamentally flawed, especially when persistence is enabled. I suggest extending the logic of data storage to be able to store key tombstones, that is, to keep the version for deleted entries. The tombstones will be stored when rebalance is in progress and should be cleaned up when rebalance is completed. Later this approach may be used to implement fast partition rebalance based on merkle trees (in this case, tombstones should be written on an incomplete baseline). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11700) Document disabled WAL during rebalance
Alexey Goncharuk created IGNITE-11700: - Summary: Document disabled WAL during rebalance Key: IGNITE-11700 URL: https://issues.apache.org/jira/browse/IGNITE-11700 Project: Ignite Issue Type: Task Reporter: Alexey Goncharuk -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11691) IgniteWalSerializerVersionTest fails in master with NoSuchElementException
Alexey Goncharuk created IGNITE-11691: - Summary: IgniteWalSerializerVersionTest fails in master with NoSuchElementException Key: IGNITE-11691 URL: https://issues.apache.org/jira/browse/IGNITE-11691 Project: Ignite Issue Type: Bug Reporter: Alexey Goncharuk https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8=2723006680195182103=%3Cdefault%3E=testDetails The issue is caused by an incorrect test: test iterator should pass only instances of TimeStamp records. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11687) Concurrent WAL replay & log may fail with CRC error on read
Alexey Goncharuk created IGNITE-11687: - Summary: Concurrent WAL replay & log may fail with CRC error on read Key: IGNITE-11687 URL: https://issues.apache.org/jira/browse/IGNITE-11687 Project: Ignite Issue Type: Bug Reporter: Alexey Goncharuk The cause is the way {{end}} is calculated for WAL iterator: {code} if (hnd != null) end = hnd.position(); {code} {code} @Override public FileWALPointer position() { lock.lock(); try { return new FileWALPointer(getSegmentId(), (int)written, 0); } finally { lock.unlock(); } } {code} Consider a partially written entry. In this case, {{written}} has been already updated, concurrent WAL replay will attempt to read the incompletely written record and since {{end}} is not null, iterator will fail with CRC error. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
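One possible way to picture the problem is to keep two positions: the bytes reserved so far and the end of the last fully written record, with replay bounded by the latter. The sketch below assumes a single writer whose records complete in order; the class and its methods are hypothetical and do not mirror the actual WAL file handle.

```java
/** Hypothetical single-writer WAL segment; not Ignite's file write handle. */
class WalSegmentSketch {
    private final byte[] buf = new byte[1024];

    /** Bytes reserved so far; may cover a record that is still being copied. */
    private int written;

    /** End of the last fully written record; a safe upper bound for replay. */
    private int completed;

    /** Step 1: reserve space. From now on 'written' covers bytes that are not there yet. */
    synchronized int reserve(int len) {
        int pos = written;

        written += len;

        return pos;
    }

    /** Step 2: copy the payload and only then move the replay boundary. */
    synchronized void complete(int pos, byte[] rec) {
        System.arraycopy(rec, 0, buf, pos, rec.length);

        completed = pos + rec.length;
    }

    synchronized int written() {
        return written;
    }

    /** Position a concurrent replay iterator may read up to. */
    synchronized int replayEnd() {
        return completed;
    }
}
```

Between the two steps {{written()}} is already ahead of {{replayEnd()}}; an iterator bounded by the former would read a partial record, which matches the CRC failure described in the ticket.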
[jira] [Created] (IGNITE-11676) Clean up custom event callbacks
Alexey Goncharuk created IGNITE-11676: - Summary: Clean up custom event callbacks Key: IGNITE-11676 URL: https://issues.apache.org/jira/browse/IGNITE-11676 Project: Ignite Issue Type: Improvement Reporter: Alexey Goncharuk Currently, {{GridDiscoveryManager}} has several ways of notifying Ignite components of discovery events: * Line 668: a set of {{instanceof}} statements to invoke specific callbacks for components (for example, {{ctx.state().onStateChangeMessage(...)}}, {{ctx.cache().onCustomEvent(...)}}) * Later, on line 715: we call the somewhat generic custom event listeners * Finally, on line 876, if the custom message was of a specific type, we fire EVT_DISCOVERY_CUSTOM_EVT Overall, this is a huge abstraction leak, and all non-discovery specifics should be eliminated from {{GridDiscoveryManager}}. I suggest the following: 1) Change {{CustomEventListener}} to have two methods: one of them should return {{true}} or {{false}} to determine whether the minor topology version should be incremented 2) Move all logic to the corresponding components 3) Get rid of the code on line 876, I see no need for it. Also, consider removing {{EVT_DISCOVERY_CUSTOM_EVT}}, as this is private API and should now be covered by the specific listener -- This message was sent by Atlassian JIRA (v7.6.3#76005)
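Suggestion 1) could look roughly like the following. This is only a sketch: the interface name, both method names and the dispatcher are hypothetical, and the real {{CustomEventListener}} signature carries more context (topology version, sending node) than is modeled here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical two-method listener; names are illustrative only. */
interface CustomEventListenerSketch {
    /** Returns true if this message must bump the minor topology version. */
    boolean requiresMinorVersionIncrement(Object msg);

    /** Component-specific handling, invoked with the resulting minor version. */
    void onCustomEvent(int minorTopVer, Object msg);
}

/** Hypothetical dispatcher with no component-specific instanceof checks. */
class CustomEventDispatcherSketch {
    private final List<CustomEventListenerSketch> lsnrs = new ArrayList<>();

    private int minorTopVer;

    void register(CustomEventListenerSketch lsnr) {
        lsnrs.add(lsnr);
    }

    /** Asks all listeners first, then notifies them; returns the minor version used. */
    int dispatch(Object msg) {
        boolean inc = false;

        for (CustomEventListenerSketch lsnr : lsnrs)
            inc |= lsnr.requiresMinorVersionIncrement(msg);

        if (inc)
            minorTopVer++;

        for (CustomEventListenerSketch lsnr : lsnrs)
            lsnr.onCustomEvent(minorTopVer, msg);

        return minorTopVer;
    }
}
```

With this shape the discovery side only aggregates the listeners' answers, and the knowledge of which message types change the topology version lives in the components that own those messages.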
[jira] [Created] (IGNITE-11644) Get rid of old exchange protocol
Alexey Goncharuk created IGNITE-11644: - Summary: Get rid of old exchange protocol Key: IGNITE-11644 URL: https://issues.apache.org/jira/browse/IGNITE-11644 Project: Ignite Issue Type: Improvement Reporter: Alexey Goncharuk Old (non-merging exchange protocol) is not used anymore and should be removed from the code to clean it up. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11633) Fix errors in WAL disabled archive mode documentation
Alexey Goncharuk created IGNITE-11633: - Summary: Fix errors in WAL disabled archive mode documentation Key: IGNITE-11633 URL: https://issues.apache.org/jira/browse/IGNITE-11633 Project: Ignite Issue Type: Task Components: documentation Reporter: Alexey Goncharuk In https://apacheignite.readme.io/docs/write-ahead-log#section-disabling-wal-archiving there is an error. The documentation says that " instead, it will overwrite the active segments in a cyclical order". In fact, when walWork == walArchive, the whole folder behaves as a sequential log, where new files are sequentially created (0, 1, 2, 3, ...) and old files are eventually truncated. Also, need to clarify the wal size setting in this mode. Ask [~dpavlov] and [~akalashnikov] for details. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11626) InitNewCoordinatorFuture should be reported in diagnostic output
Alexey Goncharuk created IGNITE-11626: - Summary: InitNewCoordinatorFuture should be reported in diagnostic output Key: IGNITE-11626 URL: https://issues.apache.org/jira/browse/IGNITE-11626 Project: Ignite Issue Type: Improvement Reporter: Alexey Goncharuk Currently {{InitNewCoordinatorFuture}} is not printed in PME diagnostic output. This future also does not implement diagnostic aware interface and remote information is not collected for this future. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11622) Investigate how CacheEntryRemovedException can be thrown in onEntriesLocked
Alexey Goncharuk created IGNITE-11622: - Summary: Investigate how CacheEntryRemovedException can be thrown in onEntriesLocked Key: IGNITE-11622 URL: https://issues.apache.org/jira/browse/IGNITE-11622 Project: Ignite Issue Type: Task Reporter: Alexey Goncharuk -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11618) Assertion got removed exception on entry with dht local candidate on transaction timeout
Alexey Goncharuk created IGNITE-11618: - Summary: Assertion got removed exception on entry with dht local candidate on transaction timeout Key: IGNITE-11618 URL: https://issues.apache.org/jira/browse/IGNITE-11618 Project: Ignite Issue Type: Bug Reporter: Alexey Goncharuk -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11617) New exchange coordinator skips client fast reply for previous exchange
Alexey Goncharuk created IGNITE-11617: - Summary: New exchange coordinator skips client fast reply for previous exchange Key: IGNITE-11617 URL: https://issues.apache.org/jira/browse/IGNITE-11617 Project: Ignite Issue Type: Bug Reporter: Alexey Goncharuk -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11616) NPE in MvccProcessorImpl when stopping a starting node
Alexey Goncharuk created IGNITE-11616: - Summary: NPE in MvccProcessorImpl when stopping a starting node Key: IGNITE-11616 URL: https://issues.apache.org/jira/browse/IGNITE-11616 Project: Ignite Issue Type: Test Components: sql Reporter: Alexey Goncharuk Fix For: 2.8 I observe the following NPE in IgniteBaselineAffinityTopologyActivationTest. It happens because we shutdown when MVCC coordinator is not assigned yet {code} java.lang.NullPointerException at java.util.concurrent.ConcurrentHashMap.replaceNode(ConcurrentHashMap.java:1106) at java.util.concurrent.ConcurrentHashMap.remove(ConcurrentHashMap.java:1097) at org.apache.ignite.internal.processors.cache.mvcc.MvccProcessorImpl.onCoordinatorFailed(MvccProcessorImpl.java:527) at org.apache.ignite.internal.processors.cache.mvcc.MvccProcessorImpl.onKernalStop(MvccProcessorImpl.java:459) at org.apache.ignite.internal.IgniteKernal.stop0(IgniteKernal.java:2335) at org.apache.ignite.internal.IgniteKernal.stop(IgniteKernal.java:2283) at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1194) at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:1992) at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1683) at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1109) at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:607) at org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:984) at org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:925) at org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:913) at org.apache.ignite.internal.processors.cache.persistence.IgniteBaselineAffinityTopologyActivationTest.startGridWithConsistentId(IgniteBaselineAffinityTopologyActivationTest.java:729) at 
org.apache.ignite.internal.processors.cache.persistence.IgniteBaselineAffinityTopologyActivationTest.testNodeWithBltIsNotAllowedToJoinClusterDuringFirstActivation(IgniteBaselineAffinityTopologyActivationTest.java:532) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.apache.ignite.testframework.junits.GridAbstractTest$6.run(GridAbstractTest.java:2102) at java.lang.Thread.run(Thread.java:745) {code} At first glance it looks like we can simply ignore the {{null}} node ID; however, there is a race: in {{onKernalStop}} we block a busy lock and remove the discovery listener, then do a coordinator cleanup. However, the discovery notification worker is only stopped in the {{stop}} phase, while the MVCC manager does its cleanup in the {{onKernalStop}} phase, so the listener can execute some code after {{onKernalStop}} is executed because there is no busy lock protection in the discovery listener itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11615) IgniteBaselineAffinityTopologyActivationTest sometimes fails due to NPE on node stop
Alexey Goncharuk created IGNITE-11615: - Summary: IgniteBaselineAffinityTopologyActivationTest sometimes fails due to NPE on node stop Key: IGNITE-11615 URL: https://issues.apache.org/jira/browse/IGNITE-11615 Project: Ignite Issue Type: Test Reporter: Alexey Goncharuk Assignee: Alexey Goncharuk Fix For: 2.8 I observe two kinds of NPEs in the test: The first one happens because we shut down when the MVCC coordinator is not assigned yet {code} java.lang.NullPointerException at java.util.concurrent.ConcurrentHashMap.replaceNode(ConcurrentHashMap.java:1106) at java.util.concurrent.ConcurrentHashMap.remove(ConcurrentHashMap.java:1097) at org.apache.ignite.internal.processors.cache.mvcc.MvccProcessorImpl.onCoordinatorFailed(MvccProcessorImpl.java:527) at org.apache.ignite.internal.processors.cache.mvcc.MvccProcessorImpl.onKernalStop(MvccProcessorImpl.java:459) at org.apache.ignite.internal.IgniteKernal.stop0(IgniteKernal.java:2335) at org.apache.ignite.internal.IgniteKernal.stop(IgniteKernal.java:2283) at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1194) at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:1992) at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1683) at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1109) at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:607) at org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:984) at org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:925) at org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:913) at org.apache.ignite.internal.processors.cache.persistence.IgniteBaselineAffinityTopologyActivationTest.startGridWithConsistentId(IgniteBaselineAffinityTopologyActivationTest.java:729) at
org.apache.ignite.internal.processors.cache.persistence.IgniteBaselineAffinityTopologyActivationTest.testNodeWithBltIsNotAllowedToJoinClusterDuringFirstActivation(IgniteBaselineAffinityTopologyActivationTest.java:532) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.apache.ignite.testframework.junits.GridAbstractTest$6.run(GridAbstractTest.java:2102) at java.lang.Thread.run(Thread.java:745) {code} The second one happens because we shutdown before the preloader was initialized: {code} [13:24:30]W: [org.apache.ignite:ignite-core] java.lang.NullPointerException [13:24:30]W: [org.apache.ignite:ignite-core]at org.apache.ignite.internal.processors.cache.CacheGroupContext.onKernalStop(CacheGroupContext.java:742) [13:24:30]W: [org.apache.ignite:ignite-core]at org.apache.ignite.internal.processors.cache.GridCacheProcessor.onKernalStop(GridCacheProcessor.java:1158) [13:24:30]W: [org.apache.ignite:ignite-core]at org.apache.ignite.internal.IgniteKernal.stop0(IgniteKernal.java:2335) [13:24:30]W: [org.apache.ignite:ignite-core]at org.apache.ignite.internal.IgniteKernal.stop(IgniteKernal.java:2283) [13:24:30]W: [org.apache.ignite:ignite-core]at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop0(IgnitionEx.java:2570) [13:24:30]W: 
[org.apache.ignite:ignite-core]at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop(IgnitionEx.java:2533) [13:24:30]W: [org.apache.ignite:ignite-core]at org.apache.ignite.internal.IgnitionEx.stop(IgnitionEx.java:330) [13:24:30]W: [org.apache.ignite:ignite-core]at org.apache.ignite.Ignition.stop(Ignition.java:223) [13:24:30]W: [org.apache.ignite:ignite-core]at org.apache.ignite.testframework.junits.GridAbstractTest.stopGrid(GridAbstractTest.java:1187) [13:24:30]W: [org.apache.ignite:ignite-core]
[jira] [Created] (IGNITE-11613) GridSpringBeanSerializationSelfTest fails in master
Alexey Goncharuk created IGNITE-11613: - Summary: GridSpringBeanSerializationSelfTest fails in master Key: IGNITE-11613 URL: https://issues.apache.org/jira/browse/IGNITE-11613 Project: Ignite Issue Type: Test Reporter: Alexey Goncharuk From the logs it's clear that the test fails because the node being started picks up some other nodes through multicast and fails because of an incompatible configuration. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11608) Document static persistent caches and DDL behavior
Alexey Goncharuk created IGNITE-11608: - Summary: Document static persistent caches and DDL behavior Key: IGNITE-11608 URL: https://issues.apache.org/jira/browse/IGNITE-11608 Project: Ignite Issue Type: Task Components: documentation Reporter: Alexey Goncharuk In IGNITE-11541 we changed the logic to ignore static cache configuration in favor of the persisted cache config because the old behavior was incorrect with regard to DDL. A system property was introduced to keep the old behavior. These changes should be documented in https://apacheignite.readme.io/docs/cache-configuration -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11604) Drop column does not remove column from internal schema
Alexey Goncharuk created IGNITE-11604: - Summary: Drop column does not remove column from internal schema Key: IGNITE-11604 URL: https://issues.apache.org/jira/browse/IGNITE-11604 Project: Ignite Issue Type: Bug Reporter: Alexey Goncharuk Discovered this during the work on IGNITE-11541 (see {{StaticCacheDdlTest.testDropColumn}}). After a quick debug I see the following: in {{GridQueryProcessor#onSchemaFinishDiscovery}} we call {{DynamicCacheDescriptor.schemaChangeFinish(msg)}}, which eventually calls {{QuerySchema.finish(op)}}. Inside that method the {{message.columns()}} collection has the field name in upper case (in the test it's "FIELD_TO_DROP"), but the entity's fields map contains lower-case names ("field_to_drop"). As a result, the field is not removed from the query entity, the query entity is saved to the stored cache data, and the field reappears after restart. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
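The case mismatch described in IGNITE-11604 can be illustrated with a minimal, self-contained sketch. It does not use Ignite's actual {{QuerySchema}} code: the class {{DropColumnCaseSketch}} and both helper methods are hypothetical, and the case-insensitive variant is shown only as a contrast, not as the fix chosen in the ticket.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a query entity's field map; not Ignite's QuerySchema. */
final class DropColumnCaseSketch {
    /** Removes columns by exact name, mirroring the lookup that misses in the ticket. */
    static Map<String, String> dropExact(Map<String, String> fields, List<String> cols) {
        Map<String, String> res = new LinkedHashMap<>(fields);

        for (String col : cols)
            res.remove(col); // "FIELD_TO_DROP" does not match the key "field_to_drop".

        return res;
    }

    /** Removes columns ignoring case (an assumption for illustration only). */
    static Map<String, String> dropIgnoreCase(Map<String, String> fields, List<String> cols) {
        Map<String, String> res = new LinkedHashMap<>(fields);

        for (String col : cols)
            res.keySet().removeIf(name -> name.equalsIgnoreCase(col));

        return res;
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();

        fields.put("id", "java.lang.Integer");
        fields.put("field_to_drop", "java.lang.String");

        List<String> cols = Collections.singletonList("FIELD_TO_DROP");

        System.out.println(dropExact(fields, cols).keySet());      // [id, field_to_drop]
        System.out.println(dropIgnoreCase(fields, cols).keySet()); // [id]
    }
}
```

With the exact-match lookup the dropped field survives in the map, which matches the symptom in the ticket: the stale field would be persisted and show up again after restart.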
[jira] [Created] (IGNITE-11573) Incompatible baseline topology triggers failure handler
Alexey Goncharuk created IGNITE-11573: - Summary: Incompatible baseline topology triggers failure handler Key: IGNITE-11573 URL: https://issues.apache.org/jira/browse/IGNITE-11573 Project: Ignite Issue Type: Bug Reporter: Alexey Goncharuk See logs in {{testNodeFailsToJoinWithIncompatiblePreviousBaselineTopology}}: the SPI throws a valid exception, but in addition to throwing the exception to the user we trigger the failure handler. Given that the exception is valid, the failure handler should not be triggered. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11572) Node restart in ignite.sh was broken by IGNITE-11216
Alexey Goncharuk created IGNITE-11572: - Summary: Node restart in ignite.sh was broken by IGNITE-11216 Key: IGNITE-11572 URL: https://issues.apache.org/jira/browse/IGNITE-11572 Project: Ignite Issue Type: Bug Reporter: Alexey Goncharuk -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11555) Unable to await partitions release latch caused by coordinator failover
Alexey Goncharuk created IGNITE-11555: - Summary: Unable to await partitions release latch caused by coordinator failover Key: IGNITE-11555 URL: https://issues.apache.org/jira/browse/IGNITE-11555 Project: Ignite Issue Type: Bug Reporter: Alexey Goncharuk -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11546) FileDownloader may be closed earlier than the file is downloaded
Alexey Goncharuk created IGNITE-11546: - Summary: FileDownloader may be closed earlier than the file is downloaded Key: IGNITE-11546 URL: https://issues.apache.org/jira/browse/IGNITE-11546 Project: Ignite Issue Type: Improvement Reporter: Alexey Goncharuk -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11468) System caches should not survive client reconnect
Alexey Goncharuk created IGNITE-11468: - Summary: System caches should not survive client reconnect Key: IGNITE-11468 URL: https://issues.apache.org/jira/browse/IGNITE-11468 Project: Ignite Issue Type: Improvement Reporter: Alexey Goncharuk After IGNITE-10898 it became clear that system caches surviving client reconnect are error-prone: there are too many specific code paths leading to logic duplication and exclusions. At the same time, the benefit of system caches survival is not tangible. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11454) Race in ClientImpl may lead to client node segmentation on fast reconnect
Alexey Goncharuk created IGNITE-11454: - Summary: Race in ClientImpl may lead to client node segmentation on fast reconnect Key: IGNITE-11454 URL: https://issues.apache.org/jira/browse/IGNITE-11454 Project: Ignite Issue Type: Bug Reporter: Alexey Goncharuk -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11432) Add ability to specify auto-generated consistent ID in IgniteConfiguration
Alexey Goncharuk created IGNITE-11432: - Summary: Add ability to specify auto-generated consistent ID in IgniteConfiguration Key: IGNITE-11432 URL: https://issues.apache.org/jira/browse/IGNITE-11432 Project: Ignite Issue Type: Improvement Reporter: Alexey Goncharuk -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11394) Infinite No next node in topology messages during node restart scenario
Alexey Goncharuk created IGNITE-11394: - Summary: Infinite No next node in topology messages during node restart scenario Key: IGNITE-11394 URL: https://issues.apache.org/jira/browse/IGNITE-11394 Project: Ignite Issue Type: Improvement Reporter: Alexey Goncharuk -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11392) Improve LRT diagnostic messages
Alexey Goncharuk created IGNITE-11392: - Summary: Improve LRT diagnostic messages Key: IGNITE-11392 URL: https://issues.apache.org/jira/browse/IGNITE-11392 Project: Ignite Issue Type: Improvement Reporter: Alexey Goncharuk Fix For: 2.8 Currently we print out only local information about long-running transactions. This makes it hard to understand the cause of the LRT when an ACTIVE transaction is initiated by a client in a large system and is not being committed. Given that a primary node knows the near node ID, we can send a diagnostic message to the near node for an ACTIVE transaction, find the thread that started the transaction and dump its stack, so the server node logs can at least give an idea why the transaction is not being committed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11272) Investigate race between local node promotion to the first node in topology and NodeAddedMessage
Alexey Goncharuk created IGNITE-11272: - Summary: Investigate race between local node promotion to the first node in topology and NodeAddedMessage Key: IGNITE-11272 URL: https://issues.apache.org/jira/browse/IGNITE-11272 Project: Ignite Issue Type: Improvement Reporter: Alexey Goncharuk The following race is possible: a node sends a join request and succeeds, but does not receive {{NodeAddFinishedMessage}} in time; after that, due to a short connection break, it fails to send a join request and decides to promote itself to the first node in the topology. At the same time, the network may be restored and {{NodeAddedMessage}} may be received by the local node. The behavior is currently undefined. This should be tested and fixed if needed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11271) Investigate setting discardCustomMsgId to null in prepareNodeAddedMessage
Alexey Goncharuk created IGNITE-11271: - Summary: Investigate setting discardCustomMsgId to null in prepareNodeAddedMessage Key: IGNITE-11271 URL: https://issues.apache.org/jira/browse/IGNITE-11271 Project: Ignite Issue Type: Improvement Reporter: Alexey Goncharuk From debugging IGNITE-10935 it was discovered that NodeAddedMessage contains a wrong state: pending messages are already filtered out by discard ID, but at the same time discardId and customDiscardId are set to non-null values. This resulted in a broken pending messages iterator on a newly added node: SkipIterator was skipping all pending messages until a valid discardId was received. The fix made in IGNITE-10935 was incomplete because we should have set both discardId and customDiscardId to null. However, after running TC tests it turned out that setting discardCustomMsgId to null resulted in duplicate custom events (the particular failed test is AuthenticationProcessorNodeRestartTest#testConcurrentAddUpdateRemoveNodeRestartServer). The reason behind the failed test is that some of the fired custom events are delivered multiple times. This should be investigated. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11214) IgniteCacheAtomicPutAllFailoverSelfTest fails with "Cannot find cache"
Alexey Goncharuk created IGNITE-11214: - Summary: IgniteCacheAtomicPutAllFailoverSelfTest fails with "Cannot find cache" Key: IGNITE-11214 URL: https://issues.apache.org/jira/browse/IGNITE-11214 Project: Ignite Issue Type: Test Reporter: Alexey Goncharuk Fix For: 2.8 In the test local node does not have a cache started, but calls {{affinity.map...}} which causes the node to fetch affinity from the cluster. Due to a race the local node may observe {{readyAffinityVer == AffinityTopologyVersion(1,0)}} which is fine, but it requests the affinity using this topology version, which results in a missed affinity exception. Suggested solution is to use {{max(readyAffinityVersion, discovery.topologyVersionEx)}} when fetching affinity from remote nodes (a node does not have the cache context started). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
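The suggestion in IGNITE-11214 to use {{max(readyAffinityVersion, discovery.topologyVersionEx)}} can be sketched as follows. This is only an illustration under stated assumptions: {{TopVer}} is a hypothetical, simplified stand-in for {{AffinityTopologyVersion}} (a major and a minor component compared in that order), and {{versionForRemoteFetch}} is an invented helper name, not an Ignite method.

```java
/** Sketch of picking the topology version for a remote affinity fetch. */
final class AffinityFetchVersionSketch {
    /** Hypothetical simplified stand-in for AffinityTopologyVersion. */
    static final class TopVer implements Comparable<TopVer> {
        final long major;
        final int minor;

        TopVer(long major, int minor) {
            this.major = major;
            this.minor = minor;
        }

        /** Orders by major version first, then by minor version. */
        @Override public int compareTo(TopVer o) {
            int cmp = Long.compare(major, o.major);

            return cmp != 0 ? cmp : Integer.compare(minor, o.minor);
        }

        @Override public String toString() {
            return "TopVer [major=" + major + ", minor=" + minor + ']';
        }
    }

    /** Returns the later of the two versions (hypothetical helper). */
    static TopVer versionForRemoteFetch(TopVer readyAffVer, TopVer discoTopVer) {
        return readyAffVer.compareTo(discoTopVer) >= 0 ? readyAffVer : discoTopVer;
    }

    public static void main(String[] args) {
        // The local node observes a stale ready version while discovery is already ahead.
        TopVer ready = new TopVer(1, 0);
        TopVer disco = new TopVer(3, 0);

        System.out.println(versionForRemoteFetch(ready, disco)); // TopVer [major=3, minor=0]
    }
}
```

Requesting affinity with the later of the two versions avoids asking remote nodes for the early version {{(1,0)}} that the ticket identifies as the source of the missed affinity exception.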
[jira] [Created] (IGNITE-11204) Merged partitions exchange future ignores NODE_LEFT/FAILED events for merged exchanges
Alexey Goncharuk created IGNITE-11204: - Summary: Merged partitions exchange future ignores NODE_LEFT/FAILED events for merged exchanges Key: IGNITE-11204 URL: https://issues.apache.org/jira/browse/IGNITE-11204 Project: Ignite Issue Type: Bug Reporter: Alexey Goncharuk In {{GridDhtPartitionsExchangeFuture#onNodeLeft}} we have the following code: {code} if (!srvNodes.remove(node)) return; {code} The issue is that the {{srvNodes}} collection is created when the partition exchange future is initialized. After the exchange future is merged, we will wait for more nodes to respond. However, since those nodes were never added to {{srvNodes}}, the event will never be processed and the exchange future will hang. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
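The hang described in IGNITE-11204 can be reproduced in miniature with the sketch below. It is a simplified model, not Ignite code: the class {{MergedExchangeSketch}}, the {{remaining}} set, and the methods {{merge}}, {{onResponse}} and {{isDone}} are all hypothetical names; only the guard in {{onNodeLeft}} follows the snippet quoted in the ticket.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Simplified model of an exchange future that is merged after initialization. */
final class MergedExchangeSketch {
    /** Server nodes captured when the future was initialized. */
    private final Set<String> srvNodes;

    /** Nodes whose responses are still awaited (hypothetical bookkeeping). */
    private final Set<String> remaining;

    MergedExchangeSketch(String... initNodes) {
        srvNodes = new HashSet<>(Arrays.asList(initNodes));
        remaining = new HashSet<>(Arrays.asList(initNodes));
    }

    /** After a merge more nodes are awaited, but srvNodes is not updated. */
    void merge(String... joinedNodes) {
        remaining.addAll(Arrays.asList(joinedNodes));
    }

    /** A node responded, so it is no longer awaited. */
    void onResponse(String node) {
        remaining.remove(node);
    }

    /** Mirrors the guard from the ticket; returns whether the event was processed. */
    boolean onNodeLeft(String node) {
        if (!srvNodes.remove(node))
            return false; // Event is ignored for nodes added by the merge.

        remaining.remove(node);

        return true;
    }

    boolean isDone() {
        return remaining.isEmpty();
    }

    public static void main(String[] args) {
        MergedExchangeSketch fut = new MergedExchangeSketch("A", "B");

        fut.merge("C");
        fut.onResponse("A");
        fut.onResponse("B");

        System.out.println(fut.onNodeLeft("C")); // false
        System.out.println(fut.isDone());        // false
    }
}
```

In this model node C left the topology, yet the future keeps waiting for it because the left event was dropped by the guard.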
[jira] [Created] (IGNITE-11050) Potential deadlock caused by DhtColocatedLockFuture#map being called inside topology read lock
Alexey Goncharuk created IGNITE-11050: - Summary: Potential deadlock caused by DhtColocatedLockFuture#map being called inside topology read lock Key: IGNITE-11050 URL: https://issues.apache.org/jira/browse/IGNITE-11050 Project: Ignite Issue Type: Bug Reporter: Alexey Goncharuk I observed the following stacktrace on TC during tests analysis: {code} Thread [name="exchange-worker-#18471%near.GridCachePartitionedNodeRestartTest0%", id=23715, state=WAITING, blockCnt=860, waitCnt=775] Lock [object=java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync@2bfb6b49, ownerName=null, ownerId=-1] at sun.misc.Unsafe.park(Native Method) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943) at o.a.i.i.util.StripedCompositeReadWriteLock$WriteLock.lock0(StripedCompositeReadWriteLock.java:173) at o.a.i.i.util.StripedCompositeReadWriteLock$WriteLock.lock(StripedCompositeReadWriteLock.java:142) at o.a.i.i.processors.cache.distributed.dht.topology.GridDhtPartitionTopologyImpl.localPartition0(GridDhtPartitionTopologyImpl.java:925) at o.a.i.i.processors.cache.distributed.dht.topology.GridDhtPartitionTopologyImpl.localPartition(GridDhtPartitionTopologyImpl.java:826) at o.a.i.i.processors.cache.distributed.dht.GridCachePartitionedConcurrentMap.localPartition(GridCachePartitionedConcurrentMap.java:70) at o.a.i.i.processors.cache.distributed.dht.GridCachePartitionedConcurrentMap.putEntryIfObsoleteOrAbsent(GridCachePartitionedConcurrentMap.java:89) at o.a.i.i.processors.cache.GridCacheAdapter.entryEx(GridCacheAdapter.java:1019) at 
o.a.i.i.processors.cache.distributed.dht.GridDhtCacheAdapter.entryEx(GridDhtCacheAdapter.java:544) at o.a.i.i.processors.cache.transactions.IgniteTxManager.txUnlock(IgniteTxManager.java:1764) at o.a.i.i.processors.cache.transactions.IgniteTxManager.unlockMultiple(IgniteTxManager.java:1775) at o.a.i.i.processors.cache.transactions.IgniteTxManager.rollbackTx(IgniteTxManager.java:1347) at o.a.i.i.processors.cache.transactions.IgniteTxLocalAdapter.userRollback(IgniteTxLocalAdapter.java:1075) at o.a.i.i.processors.cache.distributed.near.GridNearTxLocal.localFinish(GridNearTxLocal.java:3602) at o.a.i.i.processors.cache.distributed.near.GridNearTxFinishFuture.doFinish(GridNearTxFinishFuture.java:440) at o.a.i.i.processors.cache.distributed.near.GridNearTxFinishFuture.finish(GridNearTxFinishFuture.java:390) at o.a.i.i.processors.cache.distributed.near.GridNearTxLocal.rollbackNearTxLocalAsync(GridNearTxLocal.java:3833) at o.a.i.i.processors.cache.distributed.near.GridNearTxLocal.rollbackNearTxLocalAsync(GridNearTxLocal.java:3784) at o.a.i.i.processors.cache.GridCacheAdapter$53.applyx(GridCacheAdapter.java:4409) at o.a.i.i.processors.cache.GridCacheAdapter$53.applyx(GridCacheAdapter.java:4399) at o.a.i.i.util.lang.IgniteClosureX.apply(IgniteClosureX.java:38) at o.a.i.i.util.future.GridFutureChainListener.applyCallback(GridFutureChainListener.java:78) at o.a.i.i.util.future.GridFutureChainListener.apply(GridFutureChainListener.java:70) at o.a.i.i.util.future.GridFutureChainListener.apply(GridFutureChainListener.java:30) at o.a.i.i.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:399) at o.a.i.i.util.future.GridFutureAdapter.unblock(GridFutureAdapter.java:347) at o.a.i.i.util.future.GridFutureAdapter.unblockAll(GridFutureAdapter.java:335) at o.a.i.i.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:511) at o.a.i.i.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:490) at 
o.a.i.i.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:478) at o.a.i.i.util.future.GridFutureChainListener.applyCallback(GridFutureChainListener.java:81) at o.a.i.i.util.future.GridFutureChainListener.apply(GridFutureChainListener.java:70) at o.a.i.i.util.future.GridFutureChainListener.apply(GridFutureChainListener.java:30) at o.a.i.i.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:399) at o.a.i.i.util.future.GridFutureAdapter.unblock(GridFutureAdapter.java:347) at o.a.i.i.util.future.GridFutureAdapter.unblockAll(GridFutureAdapter.java:335)
[jira] [Created] (IGNITE-10898) Exchange coordinator failover breaks in some cases when node filter is used
Alexey Goncharuk created IGNITE-10898: - Summary: Exchange coordinator failover breaks in some cases when node filter is used Key: IGNITE-10898 URL: https://issues.apache.org/jira/browse/IGNITE-10898 Project: Ignite Issue Type: Bug Reporter: Alexey Goncharuk -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-10820) Clean up static Ignite instances in tests
Alexey Goncharuk created IGNITE-10820: - Summary: Clean up static Ignite instances in tests Key: IGNITE-10820 URL: https://issues.apache.org/jira/browse/IGNITE-10820 Project: Ignite Issue Type: Improvement Reporter: Alexey Goncharuk From the recent TC OOME it can be seen that some tests contain static Ignite instances that are not cleaned up on test end, which leads to a memory leak. As a first phase, we need to nullify them. Ultimately, we need to get rid of static fields in tests after migration to JUnit5. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-10581) Document new flag to filter cache types in control.sh
Alexey Goncharuk created IGNITE-10581: - Summary: Document new flag to filter cache types in control.sh Key: IGNITE-10581 URL: https://issues.apache.org/jira/browse/IGNITE-10581 Project: Ignite Issue Type: Task Components: documentation Reporter: Alexey Goncharuk -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-10495) Get rid of local node ID in IgniteConfiguration
Alexey Goncharuk created IGNITE-10495: - Summary: Get rid of local node ID in IgniteConfiguration Key: IGNITE-10495 URL: https://issues.apache.org/jira/browse/IGNITE-10495 Project: Ignite Issue Type: Improvement Reporter: Alexey Goncharuk Fix For: 3.0 The ticket is partially motivated by IGNITE-10484. Currently the local node ID is stored in the local node instance, which may sometimes cause tricky deadlocks. This node ID may be regenerated on clients when the client reconnects to the cluster. On the other hand, we have a deprecated method {{IgniteConfiguration#getNodeId}} which is heavily used in the codebase and which obviously breaks after the client reconnect. I suggest removing the node ID from the configuration in 3.0 since it is already deprecated, and moving the local node ID from the node instance to {{IgniteKernal}} so that we do not need the SPI context to get this ID. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-10484) Activate/deactivate cluster suite hangs sporadically
Alexey Goncharuk created IGNITE-10484: - Summary: Activate/deactivate cluster suite hangs sporadically Key: IGNITE-10484 URL: https://issues.apache.org/jira/browse/IGNITE-10484 Project: Ignite Issue Type: Bug Reporter: Alexey Goncharuk Assignee: Alexey Goncharuk I saw the following thread dump on TC (only relevant parts are kept): {code} "exchange-worker-#10918%cache.IgniteClusterActivateDeactivateTest0%" #13121 prio=5 os_prio=0 tid=0x7f0720137800 nid=0xbcf runnable [0x7f0b46f66000] java.lang.Thread.State: RUNNABLE at sun.nio.ch.FileDispatcherImpl.read0(Native Method) at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) at sun.nio.ch.IOUtil.read(IOUtil.java:197) at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380) - locked <0xdf6b3f88> (a java.lang.Object) at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.safeTcpHandshake(TcpCommunicationSpi.java:3676) at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3323) at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createNioClient(TcpCommunicationSpi.java:2991) at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:2872) at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2715) at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2674) at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:1655) at org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:1706) at org.apache.ignite.internal.processors.cluster.ClusterProcessor.sendDiagnosticMessage(ClusterProcessor.java:614) at org.apache.ignite.internal.processors.cluster.ClusterProcessor.requestDiagnosticInfo(ClusterProcessor.java:556) at 
org.apache.ignite.internal.IgniteDiagnosticPrepareContext.send(IgniteDiagnosticPrepareContext.java:131) at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager.dumpDebugInfo(GridCachePartitionExchangeManager.java:1914) at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:2914) at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:2721) at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120) at java.lang.Thread.run(Thread.java:748) ... "start-node-3" #13223 prio=5 os_prio=0 tid=0x7f08a8001800 nid=0xc30 waiting on condition [0x7f0a577f5000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304) at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178) at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141) at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1099) at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2040) at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1732) - locked <0x959ae1d0> (a org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance) at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1158) at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:656) at org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:959) at org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:900) at org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:888) at org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:854) at 
org.apache.ignite.internal.processors.cache.IgniteClusterActivateDeactivateTest.lambda$testConcurrentJoinAndActivate$4(IgniteClusterActivateDeactivateTest.java:601) at org.apache.ignite.internal.processors.cache.IgniteClusterActivateDeactivateTest$$Lambda$183/97479.call(Unknown Source) at org.apache.ignite.testframework.GridTestThread.run(GridTestThread.java:84) ... "grid-nio-worker-tcp-comm-3-#11059%cache.IgniteClusterActivateDeactivateTest5%" #13297 prio=5 os_prio=0 tid=0x7f08f809f000 nid=0xc83 waiting on condition [0x7f0a4688d000]
[jira] [Created] (IGNITE-10390) BPlusTree#isEmpty() does not release root page
Alexey Goncharuk created IGNITE-10390: - Summary: BPlusTree#isEmpty() does not release root page Key: IGNITE-10390 URL: https://issues.apache.org/jira/browse/IGNITE-10390 Project: Ignite Issue Type: Bug Reporter: Alexey Goncharuk -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-10381) U.doInParallel can terminate early due to an error in batch processing
Alexey Goncharuk created IGNITE-10381: - Summary: U.doInParallel can terminate early due to an error in batch processing Key: IGNITE-10381 URL: https://issues.apache.org/jira/browse/IGNITE-10381 Project: Ignite Issue Type: Improvement Reporter: Alexey Goncharuk -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-10348) Safely recreate metastore to mitigate IGNITE-8735
Alexey Goncharuk created IGNITE-10348: - Summary: Safely recreate metastore to mitigate IGNITE-8735 Key: IGNITE-10348 URL: https://issues.apache.org/jira/browse/IGNITE-10348 Project: Ignite Issue Type: Improvement Affects Versions: 2.4 Reporter: Alexey Goncharuk Fix For: 2.8 We've fixed IGNITE-8735, so new Ignite deployments are not affected by it, but old deployments may still fail if a wrong page was already put to the free list. We need to find a way to repair this situation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-10301) GridToStringBuilder is broken for classes with inheritance
Alexey Goncharuk created IGNITE-10301: - Summary: GridToStringBuilder is broken for classes with inheritance Key: IGNITE-10301 URL: https://issues.apache.org/jira/browse/IGNITE-10301 Project: Ignite Issue Type: Bug Affects Versions: 2.7 Reporter: Alexey Goncharuk Fix For: 2.7 Given the following class hierarchy {code} /** */ private static class Parent { /** */ private int a; /** {@inheritDoc} */ @Override public String toString() { return S.toString(Parent.class, this); } } /** */ private static class Child extends Parent { /** */ private int b; /** {@inheritDoc} */ @Override public String toString() { return S.toString(Child.class, this, super.toString()); } } private static class Wrapper { /** */ @GridToStringInclude Parent p = new Child(); /** {@inheritDoc} */ @Override public String toString() { return S.toString(Wrapper.class, this); } } {code} the next test fails: {code} /** */ public void testHierarchy() { Wrapper w = new Wrapper(); Parent p = w.p; String wS = w.toString(); String pS = p.toString(); // Expect wS to be "Wrapper [p=" + pS + ']'. assertEquals("Wrapper [p=" + pS + ']', wS); } {code} {code} Expected :Wrapper [p=Child [b=0, super=Parent [a=0]]] Actual :Wrapper [p=Parent [a=0]Child [b=0, super=]] {code} This is a regression from IGNITE-602. We need to fix this in 2.7 or revert IGNITE-602. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-10238) Intermittent Client Nodes suite hang
Alexey Goncharuk created IGNITE-10238: - Summary: Intermittent Client Nodes suite hang Key: IGNITE-10238 URL: https://issues.apache.org/jira/browse/IGNITE-10238 Project: Ignite Issue Type: Test Environment: There are occasional hangs of Client Nodes suite in master. A quick peek at the thread dumps reveals an interesting deadlock (only relevant parts of the thread dump are left): {code} "disco-notifier-worker-#634%internal.IgniteClientReconnectApiExceptionTest0%" #791 prio=5 os_prio=0 tid=0x7f990c12d800 nid=0x11b9 waiting on condition [0x7f991a0eb000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304) at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178) at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141) at org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl.metadata(CacheObjectBinaryProcessorImpl.java:656) at org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl$1.metadata(CacheObjectBinaryProcessorImpl.java:206) at org.apache.ignite.internal.binary.BinaryContext.metadata(BinaryContext.java:1293) at org.apache.ignite.internal.binary.BinaryReaderExImpl.getOrCreateSchema(BinaryReaderExImpl.java:2007) at org.apache.ignite.internal.binary.BinaryReaderExImpl.(BinaryReaderExImpl.java:286) at org.apache.ignite.internal.binary.BinaryReaderExImpl.(BinaryReaderExImpl.java:185) at org.apache.ignite.internal.binary.BinaryReaderExImpl.readField(BinaryReaderExImpl.java:1984) at org.apache.ignite.internal.binary.BinaryFieldAccessor$DefaultFinalClassAccessor.read0(BinaryFieldAccessor.java:703) at org.apache.ignite.internal.binary.BinaryFieldAccessor.read(BinaryFieldAccessor.java:188) at org.apache.ignite.internal.binary.BinaryClassDescriptor.read(BinaryClassDescriptor.java:874) at 
org.apache.ignite.internal.binary.BinaryReaderExImpl.deserialize0(BinaryReaderExImpl.java:1764) at org.apache.ignite.internal.binary.BinaryReaderExImpl.deserialize(BinaryReaderExImpl.java:1716) at org.apache.ignite.internal.binary.BinaryReaderExImpl.readField(BinaryReaderExImpl.java:1984) at org.apache.ignite.internal.binary.BinaryFieldAccessor$DefaultFinalClassAccessor.read0(BinaryFieldAccessor.java:703) at org.apache.ignite.internal.binary.BinaryFieldAccessor.read(BinaryFieldAccessor.java:188) at org.apache.ignite.internal.binary.BinaryClassDescriptor.read(BinaryClassDescriptor.java:874) at org.apache.ignite.internal.binary.BinaryReaderExImpl.deserialize0(BinaryReaderExImpl.java:1764) at org.apache.ignite.internal.binary.BinaryReaderExImpl.deserialize(BinaryReaderExImpl.java:1716) at org.apache.ignite.internal.binary.GridBinaryMarshaller.deserialize(GridBinaryMarshaller.java:313) at org.apache.ignite.internal.binary.BinaryMarshaller.unmarshal0(BinaryMarshaller.java:101) at org.apache.ignite.marshaller.AbstractNodeNameAwareMarshaller.unmarshal(AbstractNodeNameAwareMarshaller.java:81) at org.apache.ignite.internal.util.IgniteUtils.unmarshal(IgniteUtils.java:10131) at org.apache.ignite.internal.util.IgniteUtils.unmarshal(IgniteUtils.java:10160) at org.apache.ignite.internal.GridEventConsumeHandler.p2pUnmarshal(GridEventConsumeHandler.java:390) at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.processStartRequest(GridContinuousProcessor.java:1362) at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.access$400(GridContinuousProcessor.java:111) at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor$2.onCustomEvent(GridContinuousProcessor.java:203) at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor$2.onCustomEvent(GridContinuousProcessor.java:194) at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.onDiscovery0(GridDiscoveryManager.java:725) at 
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.lambda$onDiscovery$0(GridDiscoveryManager.java:602) - locked <0x0007b62859b8> (a java.lang.Object) at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4$$Lambda$17/432384581.run(Unknown Source) at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body0(GridDiscoveryManager.java:2665) at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body(GridDiscoveryManager.java:2703) at
[jira] [Created] (IGNITE-10237) Inspections build is broken in master
Alexey Goncharuk created IGNITE-10237: - Summary: Inspections build is broken in master Key: IGNITE-10237 URL: https://issues.apache.org/jira/browse/IGNITE-10237 Project: Ignite Issue Type: Test Reporter: Alexey Goncharuk -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-10123) Intermittent OOME errors in PDS indexing tests
Alexey Goncharuk created IGNITE-10123: - Summary: Intermittent OOME errors in PDS indexing tests Key: IGNITE-10123 URL: https://issues.apache.org/jira/browse/IGNITE-10123 Project: Ignite Issue Type: Test Reporter: Alexey Goncharuk Fix For: 2.8 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-10094) TC: Introduce overnight builds
Alexey Goncharuk created IGNITE-10094: - Summary: TC: Introduce overnight builds Key: IGNITE-10094 URL: https://issues.apache.org/jira/browse/IGNITE-10094 Project: Ignite Issue Type: Task Reporter: Alexey Goncharuk Creating this ticket to collect all efforts on shortening a single TC run and introducing overnight TC runs. From the infrastructure side, we need to create a separate run configuration (for example, Run All Nightly). To begin, Run All Nightly will delegate to Run All and later we will move several long-running suites to the nightly run. Nightly Run All should have a nightly trigger. From the TC bot side, we need to configure it to push nightly builds when TC is idle and additionally to track new failures in nightly runs. From the code side, we need to define an environment property that should distinguish a quick run from the nightly run. Later this property will be used to scale test duration. [~dpavlov], [~sergey-chugunov], [~vveider], can you chime in? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
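The "code side" item of IGNITE-10094 (a property that distinguishes a quick run from a nightly run and scales test duration) might look roughly like the sketch below. Everything here is an assumption made for illustration: the ticket does not name the property, so {{EXAMPLE_NIGHTLY_RUN}} is a placeholder, it is read as a JVM system property rather than an OS environment variable, and the scaling factor is arbitrary.

```java
/** Sketch of scaling test duration by a run-mode property. */
final class NightlyRunSketch {
    /** Placeholder property name; the ticket does not define one. */
    static final String NIGHTLY_PROP = "EXAMPLE_NIGHTLY_RUN";

    /** Treats a missing or non-"true" value as a quick run. */
    static boolean isNightly(String propVal) {
        return Boolean.parseBoolean(propVal);
    }

    /** Returns the quick duration, multiplied by the factor for nightly runs. */
    static long scaledDuration(long quickMs, int nightlyFactor, boolean nightly) {
        return nightly ? Math.multiplyExact(quickMs, (long)nightlyFactor) : quickMs;
    }

    public static void main(String[] args) {
        boolean nightly = isNightly(System.getProperty(NIGHTLY_PROP));

        // A test that runs for 10 seconds in a quick run and 6 times longer nightly.
        System.out.println(scaledDuration(10_000L, 6, nightly));
    }
}
```

A run configuration such as the proposed Run All Nightly could then pass {{-DEXAMPLE_NIGHTLY_RUN=true}} to the test JVM, while regular runs leave the property unset.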
[jira] [Created] (IGNITE-10068) Update documentation for username and password handling in control.sh
Alexey Goncharuk created IGNITE-10068: - Summary: Update documentation for username and password handling in control.sh Key: IGNITE-10068 URL: https://issues.apache.org/jira/browse/IGNITE-10068 Project: Ignite Issue Type: Task Components: documentation Reporter: Alexey Goncharuk Need to update documentation on ./control.sh utility handling username and password according to the linked change. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-9999) Add verbose logging for node recovery
Alexey Goncharuk created IGNITE-: Summary: Add verbose logging for node recovery Key: IGNITE- URL: https://issues.apache.org/jira/browse/IGNITE- Project: Ignite Issue Type: Task Reporter: Alexey Goncharuk -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-9996) Investigate possible performance drop in FSYNC mode for ignite-2.7 compared to ignite-2.6
Alexey Goncharuk created IGNITE-9996: Summary: Investigate possible performance drop in FSYNC mode for ignite-2.7 compared to ignite-2.6 Key: IGNITE-9996 URL: https://issues.apache.org/jira/browse/IGNITE-9996 Project: Ignite Issue Type: Task Reporter: Alexey Goncharuk -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-9943) Update documentation for default WAL archive size (added auto-adjust)
Alexey Goncharuk created IGNITE-9943: Summary: Update documentation for default WAL archive size (added auto-adjust) Key: IGNITE-9943 URL: https://issues.apache.org/jira/browse/IGNITE-9943 Project: Ignite Issue Type: Bug Reporter: Alexey Goncharuk -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-9895) DiscoveryMessageNotifierWorker must be instanceof IgniteDiscoveryThread
Alexey Goncharuk created IGNITE-9895: Summary: DiscoveryMessageNotifierWorker must be instanceof IgniteDiscoveryThread Key: IGNITE-9895 URL: https://issues.apache.org/jira/browse/IGNITE-9895 Project: Ignite Issue Type: Bug Affects Versions: 2.7 Reporter: Alexey Goncharuk Fix For: 2.7 This is a regression from IGNITE-9398. The newly added thread must implement the IgniteDiscoveryThread marker interface; otherwise a blocking future get() can occur inside the discovery worker, which leads to a cluster-wide deadlock: {code} "disco-notyfier-worker-#625%internal.IgniteClientReconnectApiExceptionTest0%" #770 prio=5 os_prio=0 tid=0x7f479c263800 nid=0x209b waiting on condition [0x7f49287ec000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304) at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177) at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140) at org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl.metadata(CacheObjectBinaryProcessorImpl.java:579) at org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl$2.metadata(CacheObjectBinaryProcessorImpl.java:197) at org.apache.ignite.internal.binary.BinaryContext.metadata(BinaryContext.java:1283) at org.apache.ignite.internal.binary.BinaryReaderExImpl.getOrCreateSchema(BinaryReaderExImpl.java:2007) at org.apache.ignite.internal.binary.BinaryReaderExImpl.<init>(BinaryReaderExImpl.java:286) at org.apache.ignite.internal.binary.BinaryReaderExImpl.<init>(BinaryReaderExImpl.java:185) at org.apache.ignite.internal.binary.BinaryReaderExImpl.readField(BinaryReaderExImpl.java:1984) at org.apache.ignite.internal.binary.BinaryFieldAccessor$DefaultFinalClassAccessor.read0(BinaryFieldAccessor.java:698) at org.apache.ignite.internal.binary.BinaryFieldAccessor.read(BinaryFieldAccessor.java:183) at
org.apache.ignite.internal.binary.BinaryClassDescriptor.read(BinaryClassDescriptor.java:870) at org.apache.ignite.internal.binary.BinaryReaderExImpl.deserialize0(BinaryReaderExImpl.java:1764) at org.apache.ignite.internal.binary.BinaryReaderExImpl.deserialize(BinaryReaderExImpl.java:1716) at org.apache.ignite.internal.binary.BinaryReaderExImpl.readField(BinaryReaderExImpl.java:1984) at org.apache.ignite.internal.binary.BinaryFieldAccessor$DefaultFinalClassAccessor.read0(BinaryFieldAccessor.java:698) at org.apache.ignite.internal.binary.BinaryFieldAccessor.read(BinaryFieldAccessor.java:183) at org.apache.ignite.internal.binary.BinaryClassDescriptor.read(BinaryClassDescriptor.java:870) at org.apache.ignite.internal.binary.BinaryReaderExImpl.deserialize0(BinaryReaderExImpl.java:1764) at org.apache.ignite.internal.binary.BinaryReaderExImpl.deserialize(BinaryReaderExImpl.java:1716) at org.apache.ignite.internal.binary.GridBinaryMarshaller.deserialize(GridBinaryMarshaller.java:310) at org.apache.ignite.internal.binary.BinaryMarshaller.unmarshal0(BinaryMarshaller.java:99) at org.apache.ignite.marshaller.AbstractNodeNameAwareMarshaller.unmarshal(AbstractNodeNameAwareMarshaller.java:82) at org.apache.ignite.internal.util.IgniteUtils.unmarshal(IgniteUtils.java:10014) at org.apache.ignite.internal.util.IgniteUtils.unmarshal(IgniteUtils.java:10043) at org.apache.ignite.internal.GridMessageListenHandler.p2pUnmarshal(GridMessageListenHandler.java:194) at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.processStartRequest(GridContinuousProcessor.java:1331) at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.access$400(GridContinuousProcessor.java:108) at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor$2.onCustomEvent(GridContinuousProcessor.java:200) at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor$2.onCustomEvent(GridContinuousProcessor.java:191) at 
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.onDiscovery0(GridDiscoveryManager.java:721) at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.lambda$onDiscovery$0(GridDiscoveryManager.java:600) - locked <0x0007860b5c70> (a java.lang.Object) at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4$$Lambda$10/346299427.run(Unknown Source) at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifyerWorker.body0(GridDiscoveryManager.java:2681) at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifyerWorker.body(GridDiscoveryManager.java:2719) at