[jira] [Assigned] (KUDU-1945) Support generation of surrogate primary keys (or tables with no PK)
[ https://issues.apache.org/jira/browse/KUDU-1945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Henke reassigned KUDU-1945:
---------------------------------

    Assignee: (was: Grant Henke)

> Support generation of surrogate primary keys (or tables with no PK)
> -------------------------------------------------------------------
>
>                 Key: KUDU-1945
>                 URL: https://issues.apache.org/jira/browse/KUDU-1945
>             Project: Kudu
>          Issue Type: New Feature
>          Components: client, master, tablet
>            Reporter: Todd Lipcon
>            Priority: Major
>              Labels: roadmap-candidate
>
> Many use cases have data where there is no "natural" primary key. For
> example, a web log use case mostly cares about partitioning and not about
> precise sorting by timestamp, and timestamps themselves are not necessarily
> unique. Rather than forcing users to come up with their own surrogate primary
> keys, Kudu should support some kind of "auto_increment" equivalent which
> generates primary keys on insertion. Alternatively, Kudu could support tables
> which are partitioned but not internally sorted.
> The advantages would be:
> - Kudu can pick primary keys on insertion to guarantee that no compaction is
>   required on the table (e.g. always assign a new key higher than any
>   existing key in the local tablet). This can improve write throughput
>   substantially, especially compared to naive PK generation schemes that a
>   user might pick, such as UUID, which would generate a uniform random-insert
>   workload (the worst case for performance).
> - Make Kudu easier to use for such use cases (no extra client code necessary).

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
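The compaction-avoidance argument above can be sketched in a few lines. This is a hypothetical illustration, not a Kudu API: `TabletKeyGenerator` and `nextKey` are invented names showing the "always assign a key higher than any existing key in the local tablet" property.

```java
import java.util.concurrent.atomic.AtomicLong;

public class SurrogateKeySketch {
    // Hypothetical per-tablet generator (not a real Kudu API): each key is
    // strictly higher than every key previously assigned by this tablet, so
    // new rows always append past the existing data instead of interleaving
    // with it -- the property that avoids compaction work on insert.
    static final class TabletKeyGenerator {
        private final AtomicLong lastKey = new AtomicLong();

        long nextKey() {
            return lastKey.incrementAndGet();
        }
    }

    public static void main(String[] args) {
        TabletKeyGenerator gen = new TabletKeyGenerator();
        long first = gen.nextKey();
        long second = gen.nextKey();
        // Monotonic ordering holds; UUID-style keys instead scatter uniformly
        // across the key space, the random-insert worst case noted above.
        System.out.println(second > first);
    }
}
```

A real implementation would also need to survive tablet server restarts and leader changes, which is where most of the design work lies.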
[jira] [Assigned] (KUDU-2858) Update docker readme to be more user focused
[ https://issues.apache.org/jira/browse/KUDU-2858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Henke reassigned KUDU-2858:
---------------------------------

    Assignee: (was: Grant Henke)

> Update docker readme to be more user focused
> --------------------------------------------
>
>                 Key: KUDU-2858
>                 URL: https://issues.apache.org/jira/browse/KUDU-2858
>             Project: Kudu
>          Issue Type: Improvement
>          Components: docker, documentation
>            Reporter: Grant Henke
>            Priority: Major
>              Labels: docker
>
> Now that the docker images are being published, we should update the readme
> to focus less on building the images and more on using the already built
> images.
[jira] [Assigned] (KUDU-3172) Enable hybrid clock and built-in NTP client in Docker by default
[ https://issues.apache.org/jira/browse/KUDU-3172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Henke reassigned KUDU-3172:
---------------------------------

    Assignee: (was: Grant Henke)

> Enable hybrid clock and built-in NTP client in Docker by default
> ----------------------------------------------------------------
>
>                 Key: KUDU-3172
>                 URL: https://issues.apache.org/jira/browse/KUDU-3172
>             Project: Kudu
>          Issue Type: Improvement
>    Affects Versions: 1.12.0
>            Reporter: Grant Henke
>            Priority: Minor
>
> Currently the docker entrypoint sets `--use_hybrid_clock=false` by default.
> This can cause unusual issues when snapshot scans are needed. Now that the
> built-in NTP client is available, we should switch to use that by default in
> the docker image by setting `--time_source=auto`.
> For the quickstart cluster we can use `--time_source=system_unsync`, given we
> expect all nodes will be on the same machine.
[jira] [Assigned] (KUDU-2788) Validate metadata across backup and restore jobs
[ https://issues.apache.org/jira/browse/KUDU-2788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Henke reassigned KUDU-2788:
---------------------------------

    Assignee: (was: Grant Henke)

> Validate metadata across backup and restore jobs
> ------------------------------------------------
>
>                 Key: KUDU-2788
>                 URL: https://issues.apache.org/jira/browse/KUDU-2788
>             Project: Kudu
>          Issue Type: Improvement
>    Affects Versions: 1.9.0
>            Reporter: Grant Henke
>            Priority: Critical
>              Labels: backup
>
> Currently the backup and restore jobs assume the metadata hasn't changed, or
> has changed in a compatible way, across runs. We should validate that this is
> true when building the backup graph and handle as many metadata changes as
> possible.
> The metadata changes that can't be handled should be clearly documented and a
> follow-up Jira filed.
[jira] [Assigned] (KUDU-3211) Add a cluster supported features request
[ https://issues.apache.org/jira/browse/KUDU-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Henke reassigned KUDU-3211:
---------------------------------

    Assignee: (was: Grant Henke)

> Add a cluster supported features request
> ----------------------------------------
>
>                 Key: KUDU-3211
>                 URL: https://issues.apache.org/jira/browse/KUDU-3211
>             Project: Kudu
>          Issue Type: Improvement
>          Components: master, supportability
>    Affects Versions: 1.13.0
>            Reporter: Grant Henke
>            Priority: Major
>
> Recently we have come across a few scenarios where it would be useful to make
> decisions in client integrations (Backup/Restore, Spark, NiFi, Impala) based
> on the supported features of the target Kudu cluster. This can be especially
> helpful when we want to use new features by default if available, but using
> the new feature requires client/integration logic changes.
> Some recent examples:
> - Push bloomfilter predicates only if supported
> - Use insert ignore operations (vs session based ignore) only if supported
> It is technically possible to be optimistic about the support of a feature
> and try to handle errors in a clever way using the required feature
> capabilities of the RPCs. However, that can be difficult to express, and near
> impossible if you want to make a decision for multiple requests or based on
> what all tablet servers support instead of based on a single request to a
> single tablet server.
> Additionally, now that we support rolling restart, we can't assume that
> because a single master or tablet server supports a feature, all servers in
> the cluster support it.
> Some thoughts on the feature/implementation:
> - This should be a master request in order to avoid needing to talk to all
>   the tablet servers.
> - We could leverage server registration requests or heartbeats to aggregate
>   the current state on the leader master.
> - We could represent these features as "cluster" level features and indicate
>   that some (union) or all (intersect) of the servers support a given feature.
> - If this request/response is not available in a cluster, the response would
>   indicate that feature support is unknown and the user can decide how to
>   proceed.
> - If we want to support disabling features via runtime flags, we will need to
>   ensure we update the master, maybe via heartbeat, with changed support for a
>   running server.
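The union/intersect semantics proposed above can be shown with plain sets. This is an illustrative sketch only (the method names and feature strings are invented, not a real Kudu RPC): the leader master aggregates per-server feature sets into a "supported by some" view and a "supported by all" view.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

public class ClusterFeatureAggregation {
    // "Some server supports it" -- the union of all reported feature sets.
    static Set<String> supportedByAny(Collection<Set<String>> perServer) {
        Set<String> union = new HashSet<>();
        perServer.forEach(union::addAll);
        return union;
    }

    // "All servers support it" -- the intersection of all reported sets.
    static Set<String> supportedByAll(Collection<Set<String>> perServer) {
        Iterator<Set<String>> it = perServer.iterator();
        if (!it.hasNext()) return Collections.emptySet();
        Set<String> intersection = new HashSet<>(it.next());
        it.forEachRemaining(intersection::retainAll);
        return intersection;
    }

    public static void main(String[] args) {
        // Mid-rolling-restart: one upgraded server, one not yet upgraded.
        List<Set<String>> reports = Arrays.asList(
                new HashSet<>(Arrays.asList("BLOOM_FILTER_PREDICATE", "INSERT_IGNORE")),
                new HashSet<>(Collections.singletonList("INSERT_IGNORE")));
        System.out.println(supportedByAny(reports));  // both features
        System.out.println(supportedByAll(reports));  // only INSERT_IGNORE
    }
}
```

A client would use the intersection view before switching to, say, insert-ignore operations by default, since during a rolling restart only some servers may have the feature.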
[jira] [Assigned] (KUDU-2500) Kudu Spark InterfaceStability class not found
[ https://issues.apache.org/jira/browse/KUDU-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Henke reassigned KUDU-2500:
---------------------------------

    Assignee: (was: Grant Henke)

> Kudu Spark InterfaceStability class not found
> ---------------------------------------------
>
>                 Key: KUDU-2500
>                 URL: https://issues.apache.org/jira/browse/KUDU-2500
>             Project: Kudu
>          Issue Type: Bug
>          Components: spark
>    Affects Versions: 1.7.0
>            Reporter: Grant Henke
>            Priority: Major
>
> We recently marked the Yetus annotation library as optional because the
> annotations are not used at runtime and therefore should not be needed. Here
> is a good summary of why the annotations are not required at runtime:
> https://stackoverflow.com/questions/3567413/why-doesnt-a-missing-annotation-cause-a-classnotfoundexception-at-runtime/3568041#3568041
> However, for some reason Spark is requiring the annotation when performing
> some reflection. See the sample stacktrace below:
> {code}
> Driver stacktrace:
>         at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1602)
>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1590)
>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1589)
>         at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>         at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>         at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1589)
>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831)
>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831)
>         at scala.Option.foreach(Option.scala:257)
>         at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:831)
>         at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1823)
>         at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1772)
>         at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1761)
>         at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
>         at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:642)
>         at org.apache.spark.SparkContext.runJob(SparkContext.scala:2034)
>         at org.apache.spark.SparkContext.runJob(SparkContext.scala:2055)
>         at org.apache.spark.SparkContext.runJob(SparkContext.scala:2074)
>         at org.apache.spark.SparkContext.runJob(SparkContext.scala:2099)
>         at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:929)
>         at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:927)
>         at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>         at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
>         at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
>         at org.apache.spark.rdd.RDD.foreachPartition(RDD.scala:927)
>         at org.apache.spark.sql.Dataset$$anonfun$foreachPartition$1.apply$mcV$sp(Dataset.scala:2675)
>         at org.apache.spark.sql.Dataset$$anonfun$foreachPartition$1.apply(Dataset.scala:2675)
>         at org.apache.spark.sql.Dataset$$anonfun$foreachPartition$1.apply(Dataset.scala:2675)
>         at org.apache.spark.sql.Dataset$$anonfun$withNewRDDExecutionId$1.apply(Dataset.scala:3239)
>         at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
>         at org.apache.spark.sql.Dataset.withNewRDDExecutionId(Dataset.scala:3235)
>         at org.apache.spark.sql.Dataset.foreachPartition(Dataset.scala:2674)
>         at org.apache.kudu.spark.kudu.KuduContext.writeRows(KuduContext.scala:276)
>         at org.apache.kudu.spark.kudu.KuduContext.insertRows(KuduContext.scala:206)
>         at org.apache.kudu.backup.KuduRestore$$anonfun$run$1.apply(KuduRestore.scala:65)
>         at org.apache.kudu.backup.KuduRestore$$anonfun$run$1.apply(KuduRestore.scala:44)
>         at scala.collection.immutable.List.foreach(List.scala:392)
>         at org.apache.kudu.backup.KuduRestore$.run(KuduRestore.scala:44)
>         at org.apache.kudu.backup.TestKuduBackup.backupAndRestore(TestKuduBackup.scala:310)
>         at org.apache.kudu.backup.TestKuduBackup$$anonfun$2.apply$mcV$sp(TestKuduBackup.scala:83)
>         at org.apache.kudu.backup.TestKuduBackup$$anonfun$2.apply(TestKuduBackup.scala:76)
>         at org.apache.kudu.backup.TestKuduBackup$$anonfun$2.apply(TestKuduBackup.scala:76)
>         at org.scalatest.OutcomeOf$class.outcomeOf(Out
> {code}
[jira] [Assigned] (KUDU-982) nullable columns should support DEFAULT NULL
[ https://issues.apache.org/jira/browse/KUDU-982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Henke reassigned KUDU-982:
--------------------------------

    Assignee: (was: Grant Henke)

> nullable columns should support DEFAULT NULL
> --------------------------------------------
>
>                 Key: KUDU-982
>                 URL: https://issues.apache.org/jira/browse/KUDU-982
>             Project: Kudu
>          Issue Type: Improvement
>          Components: api, client, master
>    Affects Versions: Private Beta
>            Reporter: Todd Lipcon
>            Priority: Major
>
> I don't think we have APIs which work for setting the default to NULL in
> Alter/Create.
[jira] [Assigned] (KUDU-3244) Build and publish kudu-binary via Gradle
[ https://issues.apache.org/jira/browse/KUDU-3244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Henke reassigned KUDU-3244:
---------------------------------

    Assignee: (was: Grant Henke)

> Build and publish kudu-binary via Gradle
> ----------------------------------------
>
>                 Key: KUDU-3244
>                 URL: https://issues.apache.org/jira/browse/KUDU-3244
>             Project: Kudu
>          Issue Type: Improvement
>          Components: build, test
>    Affects Versions: 1.14.0
>            Reporter: Grant Henke
>            Priority: Major
>
> Now that the kudu-binary jar only uses the `kudu` binary
> ([here|https://gerrit.cloudera.org/#/c/12523/]), we should be able to
> simplify the build and release process of that jar, and build that jar inside
> the Gradle build.
[jira] [Assigned] (KUDU-3132) Support RPC compression
[ https://issues.apache.org/jira/browse/KUDU-3132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Henke reassigned KUDU-3132:
---------------------------------

    Assignee: (was: Grant Henke)

> Support RPC compression
> -----------------------
>
>                 Key: KUDU-3132
>                 URL: https://issues.apache.org/jira/browse/KUDU-3132
>             Project: Kudu
>          Issue Type: New Feature
>          Components: perf, rpc
>            Reporter: Grant Henke
>            Priority: Major
>              Labels: performance, roadmap-candidate
>
> I have seen more and more deployments of Kudu where the tablet servers are
> not co-located with the compute resources such as Impala or Spark. In
> deployments like this, there could be significant network savings from
> compressing the RPC messages (especially those that write or scan data).
> Adding simple LZ4 or Snappy compression support to the RPC messages when not
> on a loopback/local connection should be a great improvement for
> network-bound applications.
[jira] [Assigned] (KUDU-2282) Support coercion of Decimal values
[ https://issues.apache.org/jira/browse/KUDU-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Henke reassigned KUDU-2282:
---------------------------------

    Assignee: (was: Grant Henke)

> Support coercion of Decimal values
> ----------------------------------
>
>                 Key: KUDU-2282
>                 URL: https://issues.apache.org/jira/browse/KUDU-2282
>             Project: Kudu
>          Issue Type: Improvement
>    Affects Versions: 1.7.0
>            Reporter: Grant Henke
>            Priority: Major
>
> Currently when decimal values are used in KuduValue.cc or PartialRow.cc we
> enforce that the scale matches the expected scale. Instead we should support
> basic coercion where no value rounding or truncating is required.
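The "basic coercion" idea above can be illustrated with `java.math.BigDecimal`. This is a sketch of the semantics only, not the actual KuduValue/PartialRow implementation: widen a value to the column's scale when that loses no information, and reject anything that would require rounding or truncation.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class DecimalCoercion {
    // Coerce a value to the column's scale only when no digits would be
    // dropped. RoundingMode.UNNECESSARY throws ArithmeticException if any
    // rounding would occur, which is exactly the rejection we want.
    static BigDecimal coerceToScale(BigDecimal value, int columnScale) {
        return value.setScale(columnScale, RoundingMode.UNNECESSARY);
    }

    public static void main(String[] args) {
        // 1.5 widens losslessly to scale 3.
        System.out.println(coerceToScale(new BigDecimal("1.5"), 3));  // 1.500
        try {
            // 1.555 cannot fit scale 2 without truncation: rejected.
            coerceToScale(new BigDecimal("1.555"), 2);
        } catch (ArithmeticException e) {
            System.out.println("rejected: coercion would require rounding");
        }
    }
}
```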
[jira] [Assigned] (KUDU-3134) Adjust default value for --raft_heartbeat_interval
[ https://issues.apache.org/jira/browse/KUDU-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Henke reassigned KUDU-3134:
---------------------------------

    Assignee: (was: Grant Henke)

> Adjust default value for --raft_heartbeat_interval
> --------------------------------------------------
>
>                 Key: KUDU-3134
>                 URL: https://issues.apache.org/jira/browse/KUDU-3134
>             Project: Kudu
>          Issue Type: Improvement
>    Affects Versions: 1.12.0
>            Reporter: Grant Henke
>            Priority: Major
>
> Users often increase the `--raft_heartbeat_interval` on larger clusters or on
> clusters with high replica counts. This helps avoid the servers flooding each
> other with heartbeat RPCs, causing queue overflows and using too much idle
> CPU. Users have raised the value to anywhere from 1.5 seconds to as high as
> 10 seconds, and we have never seen people complain about problems after
> doing so.
> Anecdotally, I recently saw a cluster with 4k tablets per tablet server using
> ~150% CPU while idle. By increasing the `--raft_heartbeat_interval` from
> 500ms to 1500ms, the CPU usage dropped to ~50%.
> Generally speaking, users often care about Kudu stability and scalability
> over an extremely short MTTR. Additionally, our default client RPC timeouts
> of 30s also seem to indicate slightly longer failover/retry times are
> tolerable in the default case.
> We should consider adjusting the default value of `--raft_heartbeat_interval`
> to a higher value to support larger and more efficient clusters by default.
> Users who need a low MTTR can always adjust the value lower while also
> adjusting other related timeouts. We may also want to consider adjusting the
> default `--heartbeat_interval_ms` accordingly.
> Note: Batching the RPCs as mentioned in KUDU-1973, or providing a server to
> server proxy for heartbeating, may be a way to solve the issues without
> adjusting the default configuration. However, adjusting the configuration is
> easy and has proven effective in production deployments. Additionally,
> adjusting the defaults along with a KUDU-1973-like approach could lead to
> even lower idle resource usage.
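The proportionality behind the anecdote above can be made explicit with back-of-the-envelope arithmetic. This is an illustrative model under stated assumptions (each tablet leader heartbeats each follower once per interval; the replica counts are invented), not a measurement of Kudu internals.

```java
public class HeartbeatLoadEstimate {
    // Steady-state outbound heartbeat RPCs per second from one server:
    // each replica it leads sends one heartbeat per follower per interval.
    static double heartbeatsPerSecond(int ledReplicas, int followersPerTablet,
                                      int intervalMs) {
        return ledReplicas * followersPerTablet * (1000.0 / intervalMs);
    }

    public static void main(String[] args) {
        // Hypothetical server leading ~1333 of its 4000 replicas, RF=3
        // (two followers each):
        double at500ms = heartbeatsPerSecond(1333, 2, 500);
        double at1500ms = heartbeatsPerSecond(1333, 2, 1500);
        System.out.printf("500ms interval:  %.0f heartbeats/sec%n", at500ms);
        System.out.printf("1500ms interval: %.0f heartbeats/sec%n", at1500ms);
        // Tripling the interval divides the heartbeat rate by three, which is
        // consistent with the idle CPU drop described in the issue.
    }
}
```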
[jira] [Assigned] (KUDU-3218) client_symbol-test fails on Centos 7 with devtoolset-8
[ https://issues.apache.org/jira/browse/KUDU-3218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Henke reassigned KUDU-3218:
---------------------------------

    Assignee: (was: Grant Henke)

> client_symbol-test fails on Centos 7 with devtoolset-8
> ------------------------------------------------------
>
>                 Key: KUDU-3218
>                 URL: https://issues.apache.org/jira/browse/KUDU-3218
>             Project: Kudu
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 1.14.0
>            Reporter: Grant Henke
>            Priority: Major
>
> When running the client_symbol-test on Centos 7 with devtoolset-8, the test
> fails with the following bad symbols:
> {code:java}
> Found bad symbol 'operator delete[](void*, unsigned long)'
> Found bad symbol 'operator delete(void*, unsigned long)'
> Found bad symbol 'transaction clone for std::logic_error::what() const'
> Found bad symbol 'transaction clone for std::runtime_error::what() const'
> Found bad symbol 'transaction clone for std::logic_error::logic_error(char const*)'
> Found bad symbol 'transaction clone for std::logic_error::logic_error(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
> Found bad symbol 'transaction clone for std::logic_error::logic_error(char const*)'
> Found bad symbol 'transaction clone for std::logic_error::logic_error(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
> Found bad symbol 'transaction clone for std::logic_error::~logic_error()'
> Found bad symbol 'transaction clone for std::logic_error::~logic_error()'
> Found bad symbol 'transaction clone for std::logic_error::~logic_error()'
> Found bad symbol 'transaction clone for std::range_error::range_error(char const*)'
> Found bad symbol 'transaction clone for std::range_error::range_error(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
> Found bad symbol 'transaction clone for std::range_error::range_error(char const*)'
> Found bad symbol 'transaction clone for std::range_error::range_error(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
> Found bad symbol 'transaction clone for std::range_error::~range_error()'
> Found bad symbol 'transaction clone for std::range_error::~range_error()'
> Found bad symbol 'transaction clone for std::range_error::~range_error()'
> Found bad symbol 'transaction clone for std::domain_error::domain_error(char const*)'
> Found bad symbol 'transaction clone for std::domain_error::domain_error(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
> Found bad symbol 'transaction clone for std::domain_error::domain_error(char const*)'
> Found bad symbol 'transaction clone for std::domain_error::domain_error(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
> Found bad symbol 'transaction clone for std::domain_error::~domain_error()'
> Found bad symbol 'transaction clone for std::domain_error::~domain_error()'
> Found bad symbol 'transaction clone for std::domain_error::~domain_error()'
> Found bad symbol 'transaction clone for std::length_error::length_error(char const*)'
> Found bad symbol 'transaction clone for std::length_error::length_error(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
> Found bad symbol 'transaction clone for std::length_error::length_error(char const*)'
> Found bad symbol 'transaction clone for std::length_error::length_error(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
> Found bad symbol 'transaction clone for std::length_error::~length_error()'
> Found bad symbol 'transaction clone for std::length_error::~length_error()'
> Found bad symbol 'transaction clone for std::length_error::~length_error()'
> Found bad symbol 'transaction clone for std::out_of_range::out_of_range(char const*)'
> Found bad symbol 'transaction clone for std::out_of_range::out_of_range(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
> Found bad symbol 'transaction clone for std::out_of_range::out_of_range(char const*)'
> Found bad symbol 'transaction clone for std::out_of_range::out_of_range(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
> Found bad symbol 'transaction clone for std::out_of_range::~out_of_range()'
> Found bad symbol 'transaction clone for std::out_of_range::~out_of_range()'
> Found bad symbol 'transaction clone for std::out_of_range::~out_of_range()'
> Found bad symbol 'transaction clone for std::runtime_error::runtime_error(char const*)'
> Found bad symbol 'transaction clone for std::runtime_error::runtime_error(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
> Found bad symbol 'transaction clone for std::runtime_error::runtime_error(char const*)'
> Found bad symbol 'transaction clone for std::runtime_error::runtime_error(std::__cxx11
> {code}
[jira] [Assigned] (KUDU-2696) libgmock is linked into the kudu cli binary
[ https://issues.apache.org/jira/browse/KUDU-2696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Henke reassigned KUDU-2696:
---------------------------------

    Assignee: (was: Grant Henke)

> libgmock is linked into the kudu cli binary
> -------------------------------------------
>
>                 Key: KUDU-2696
>                 URL: https://issues.apache.org/jira/browse/KUDU-2696
>             Project: Kudu
>          Issue Type: Bug
>          Components: build
>    Affects Versions: 1.8.0
>            Reporter: Mike Percy
>            Priority: Minor
>
> libgmock is linked into the kudu cli binary, even though we consider it a
> test-only dependency. Possibly a configuration problem in our cmake files?
> {code:java}
> $ ldd build/dynclang/bin/kudu | grep mock
> libgmock.so => /home/mpercy/src/kudu/thirdparty/installed/uninstrumented/lib/libgmock.so (0x7f01f1495000)
> {code}
> The gmock dependency does not appear in the server binaries, as expected.
[jira] [Assigned] (KUDU-1261) Support nested data types
[ https://issues.apache.org/jira/browse/KUDU-1261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Henke reassigned KUDU-1261:
---------------------------------

    Assignee: (was: Grant Henke)

> Support nested data types
> -------------------------
>
>                 Key: KUDU-1261
>                 URL: https://issues.apache.org/jira/browse/KUDU-1261
>             Project: Kudu
>          Issue Type: New Feature
>            Reporter: Jean-Daniel Cryans
>            Priority: Major
>              Labels: limitations, roadmap-candidate
>
> AKA complex data types.
> This is a common ask. I'm creating this jira so that we can at least start
> tracking how people want to use it.
[jira] [Assigned] (KUDU-2860) Sign docker images
[ https://issues.apache.org/jira/browse/KUDU-2860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Henke reassigned KUDU-2860:
---------------------------------

    Assignee: (was: Grant Henke)

> Sign docker images
> ------------------
>
>                 Key: KUDU-2860
>                 URL: https://issues.apache.org/jira/browse/KUDU-2860
>             Project: Kudu
>          Issue Type: Improvement
>          Components: docker
>            Reporter: Grant Henke
>            Priority: Major
>              Labels: docker
>
> We should sign the Apache docker images following the instructions here:
> [https://docs.docker.com/ee/dtr/user/manage-images/sign-images/]
>
> Ideally this would be handled by the build script.
[jira] [Assigned] (KUDU-2530) Add kudu pbc replace tool
[ https://issues.apache.org/jira/browse/KUDU-2530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Henke reassigned KUDU-2530:
---------------------------------

    Assignee: (was: Grant Henke)

> Add kudu pbc replace tool
> -------------------------
>
>                 Key: KUDU-2530
>                 URL: https://issues.apache.org/jira/browse/KUDU-2530
>             Project: Kudu
>          Issue Type: Improvement
>          Components: CLI
>            Reporter: Grant Henke
>            Priority: Minor
>
> We currently have a _kudu pbc dump_ and a _kudu pbc edit_ tool. However, it
> could be nice to edit the dumped file elsewhere and be able to load/replace
> the dumped pbc with it. Adding _kudu pbc replace_ would make this easier.
[jira] [Assigned] (KUDU-2820) No JUnit XMLs when running Java dist-test makes for frustrating precommit experience
[ https://issues.apache.org/jira/browse/KUDU-2820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Henke reassigned KUDU-2820:
---------------------------------

    Assignee: (was: Grant Henke)

> No JUnit XMLs when running Java dist-test makes for frustrating precommit
> experience
> -------------------------------------------------------------------------
>
>                 Key: KUDU-2820
>                 URL: https://issues.apache.org/jira/browse/KUDU-2820
>             Project: Kudu
>          Issue Type: Improvement
>          Components: test
>    Affects Versions: 1.10.0
>            Reporter: Adar Dembo
>            Priority: Major
>
> When running Java tests in dist-test (as the precommit job does), JUnit XML
> files aren't generated. That's because normally they're generated by Gradle,
> but we don't run Gradle in the dist-test slaves; we run JUnit directly.
> As a result, test failures don't propagate back to the Jenkins job, and you
> have to click through a few links (console output --> link to dist-test job
> --> filter failures only --> download the artifacts) to figure out what went
> wrong.
[jira] [Assigned] (KUDU-2524) scalafmt incompatible with jdk8 older than u25
[ https://issues.apache.org/jira/browse/KUDU-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Henke reassigned KUDU-2524:
---------------------------------

    Assignee: (was: Grant Henke)

> scalafmt incompatible with jdk8 older than u25
> ----------------------------------------------
>
>                 Key: KUDU-2524
>                 URL: https://issues.apache.org/jira/browse/KUDU-2524
>             Project: Kudu
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.8.0
>            Reporter: Adar Dembo
>            Priority: Major
>
> We're seeing a fair number of Gradle build failures in scalafmt with the
> following output:
> {noformat}
> 1: Task failed with an exception.
> ---
> * What went wrong:
> Execution failed for task ':kudu-spark:scalafmt'.
> > Uninitialized object exists on backward branch 209
>   Exception Details:
>     Location:
>       scala/collection/immutable/HashMap$HashTrieMap.split()Lscala/collection/immutable/Seq; @249: goto
>     Reason:
>       Error exists in the bytecode
>     Bytecode:
>       000: 2ab6 005b 04a0 001e b200 b3b2 00b8 04bd
>       010: 0002 5903 2a53 c000 bab6 00be b600 c2c0
>       020: 00c4 b02a b600 31b8 003b 3c1b 04a4 015e
>       030: 1b05 6c3d 2a1b 056c 2ab6 0031 b700 c63e
>       040: 2ab6 0031 021d 787e 3604 2ab6 0031 0210
>       050: 201d 647c 7e36 05bb 0014 59b2 00b8 2ab6
>       060: 0033 c000 bab6 00ca b700 cd1c b600 d13a
>       070: 0619 06c6 001a 1906 b600 d5c0 0081 3a07
>       080: 1906 b600 d8c0 0081 3a08 a700 0dbb 00da
>       090: 5919 06b7 00dd bf19 073a 0919 083a 0abb
>       0a0: 0002 5915 0419 09bb 0014 59b2 00b8 1909
>       0b0: c000 bab6 00ca b700 cd03 b800 e33a 0e3a
>       0c0: 0d03 190d b900 e701 0019 0e3a 1136 1036
>       0d0: 0f15 0f15 109f 0027 150f 0460 1510 190d
>       0e0: 150f b900 ea02 00c0 0005 3a17 1911 1917
>       0f0: b800 ee3a 1136 1036 0fa7 ffd8 1911 b800
>       100: f2b7 0060 3a0b bb00 0259 1505 190a bb00
>       110: 1459 b200 b819 0ac0 00ba b600 cab7 00cd
>       120: 03b8 00e3 3a13 3a12 0319 12b9 00e7 0100
>       130: 1913 3a16 3615 3614 1514 1515 9f00 2715
>       140: 1404 6015 1519 1215 14b9 00ea 0200 c000
>       150: 053a 1819 1619 18b8 00f5 3a16 3615 3614
>       160: a7ff d819 16b8 00f2 b700 603a 0cb2 00fa
>       170: b200 b805 bd00 0259 0319 0b53 5904 190c
>       180: 53c0 00ba b600 beb6 00fd b02a b600 3303
>       190: 32b6 00ff b0
>     Stackmap Table:
>       same_frame(@35)
>       full_frame(@141,{Object[#2],Integer,Integer,Integer,Integer,Integer,Object[#109]},{})
>       append_frame(@151,Object[#129],Object[#129])
>       full_frame(@209,{Object[#2],Integer,Integer,Integer,Integer,Integer,Object[#109],Object[#129],Object[#129],Object[#129],Object[#129],Top,Top,Object[#20],Object[#55],Integer,Integer,Object[#107]},{Uninitialized[#159],Uninitialized[#159],Integer,Object[#129]})
>       full_frame(@252,{Object[#2],Integer,Integer,Integer,Integer,Integer,Object[#109],Object[#129],Object[#129],Object[#129],Object[#129],Top,Top,Object[#20],Object[#55],Integer,Integer,Object[#107]},{Uninitialized[#159],Uninitialized[#159],Integer,Object[#129]})
>       full_frame(@312,{Object[#2],Integer,Integer,Integer,Integer,Integer,Object[#109],Object[#129],Object[#129],Object[#129],Object[#129],Object[#2],Top,Object[#20],Object[#55],Integer,Integer,Object[#107],Object[#20],Object[#55],Integer,Integer,Object[#107]},{Uninitialized[#262],Uninitialized[#262],Integer,Object[#129]})
>       full_frame(@355,{Object[#2],Integer,Integer,Integer,Integer,Integer,Object[#109],Object[#129],Object[#129],Object[#129],Object[#129],Object[#2],Top,Object[#20],Object[#55],Integer,Integer,Object[#107],Object[#20],Object[#55],Integer,Integer,Object[#107]},{Uninitialized[#262],Uninitialized[#262],Integer,Object[#129]})
>       full_frame(@395,{Object[#2],Integer},{})
> {noformat}
> This appears to be due to [this JDK
> issue|https://stackoverflow.com/questions/24061672/verifyerror-uninitialized-object-exists-on-backward-branch-jvm-spec-4-10-2-4],
> which was fixed in JDK 8u25.
> And sure enough, here's the JDK version for failing builds:
> {noformat}
> -- Found Java: /opt/toolchain/sun-jdk-64bit-1.8.0.05/bin/java (found suitable
> version "1.8.0.05", minimum required is "1.7")
> {noformat}
> And here it is for successful builds:
> {noformat}
> 19:06:12 -- Found Java: /usr/lib/jvm/java-1.8.0-openjdk-amd64/bin/java (found
> suitable version "1.8.0.111", minimum required is "1.7")
> {noformat}
> We either need to blacklist JDK8 versions older than u25, or we need to
> condition the scalafmt step on the JDK version.
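The second remedy above, conditioning the scalafmt step on the JDK version, amounts to parsing the pre-JDK 9 version string and checking the update number. A minimal sketch, assuming the invented name `scalafmtSupported` and the `1.8.0_NN` / `1.8.0.NN` version formats shown in the build logs above:

```java
public class JdkVersionGate {
    // Returns false only for JDK 8 builds whose update number is below 25,
    // the versions affected by the bytecode verifier bug described above.
    // Accepts both "1.8.0_05" (Oracle style) and "1.8.0.05" (as printed by
    // the failing toolchain build).
    static boolean scalafmtSupported(String javaVersion) {
        String[] parts = javaVersion.split("[._]");
        if (parts.length < 2 || !parts[0].equals("1") || !parts[1].equals("8")) {
            return true;  // only JDK 8 is affected
        }
        int update = parts.length > 3 ? Integer.parseInt(parts[3]) : 0;
        return update >= 25;
    }

    public static void main(String[] args) {
        System.out.println(scalafmtSupported("1.8.0.05"));   // false: skip scalafmt
        System.out.println(scalafmtSupported("1.8.0.111"));  // true
        System.out.println(scalafmtSupported("1.7.0_80"));   // true: not JDK 8
    }
}
```

In the Gradle build, such a check would gate whether the `scalafmt` task is wired into the build at all.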
[jira] [Assigned] (KUDU-3148) Add Java client metrics
[ https://issues.apache.org/jira/browse/KUDU-3148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Henke reassigned KUDU-3148:
---------------------------------

    Assignee: (was: Grant Henke)

> Add Java client metrics
> -----------------------
>
>                 Key: KUDU-3148
>                 URL: https://issues.apache.org/jira/browse/KUDU-3148
>             Project: Kudu
>          Issue Type: Improvement
>          Components: client
>    Affects Versions: 1.12.0
>            Reporter: Grant Henke
>            Priority: Major
>              Labels: roadmap-candidate, supportability
>
> This Jira is to track adding complete metrics to the Java client. There are
> many cases where applications using the client have issues that are difficult
> to debug. The primary reason is that it's hard to reason about what the
> application is doing with the Kudu client without inspecting the code, and
> even then it can be easy to miss an issue in the code as well.
> For example, we have seen many cases where an application creates a Kudu
> client, sends a few messages, and then closes the client in a loop. Creating
> many clients over and over not only impacts the performance/stability of the
> application but can also put unwelcome load on the servers. If we had request
> metrics with a client id tag periodically logged, it would be easy to grep
> the application logs for unique client ids and spot the issue and the
> offending application.
[jira] [Assigned] (KUDU-3287) Threads can linger for some time after calling close on the Java KuduClient
[ https://issues.apache.org/jira/browse/KUDU-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Henke reassigned KUDU-3287:
---------------------------------

    Assignee: (was: Grant Henke)

> Threads can linger for some time after calling close on the Java KuduClient
> ---------------------------------------------------------------------------
>
>                 Key: KUDU-3287
>                 URL: https://issues.apache.org/jira/browse/KUDU-3287
>             Project: Kudu
>          Issue Type: Bug
>    Affects Versions: 1.12.0, 1.13.0, 1.14.0
>            Reporter: Grant Henke
>            Priority: Major
>
> After the upgrade to Netty 4 in Kudu 1.12, the close/shutdown behavior of the
> Java client changed: threads and resources could linger for some time after
> the call to close() returned. This looks to be because
> `bootstrap.config().group().shutdownGracefully` is called with the default of
> 15s and returns asynchronously. Additionally, the default ExecutorService was
> not shut down on close.
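The ExecutorService half of the problem above has a standard remedy. This is a sketch of the general pattern (a `close()` that blocks until its worker pool has actually terminated), not the actual KuduClient code; the name `closeSynchronously` and the 1-second bound are illustrative. Netty's `shutdownGracefully` overloads that take an explicit quiet period and timeout address the other half in a similar spirit.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class BlockingShutdownSketch {
    // Return only after the pool has terminated, instead of firing an
    // asynchronous graceful shutdown with a long default timeout and
    // returning immediately (the lingering-threads behavior described above).
    static void closeSynchronously(ExecutorService pool) throws InterruptedException {
        pool.shutdown();  // stop accepting new work
        if (!pool.awaitTermination(1, TimeUnit.SECONDS)) {
            pool.shutdownNow();  // force-stop stragglers rather than linger
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.submit(() -> { /* some short-lived client task */ });
        closeSynchronously(pool);
        System.out.println(pool.isTerminated());  // true: no lingering threads
    }
}
```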
[jira] [Assigned] (KUDU-3135) Add Client Metadata Tokens
[ https://issues.apache.org/jira/browse/KUDU-3135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Henke reassigned KUDU-3135: - Assignee: (was: Grant Henke) > Add Client Metadata Tokens > -- > > Key: KUDU-3135 > URL: https://issues.apache.org/jira/browse/KUDU-3135 > Project: Kudu > Issue Type: Improvement > Components: client >Affects Versions: 1.12.0 >Reporter: Grant Henke >Priority: Major > Labels: roadmap-candidate, scalability > > Currently when a distributed task is done using the Kudu client, the > driver/coordinator client needs to open the table to request its current > metadata and locations. Then it can distribute the work to tasks/executors on > remote nodes. In the case of reading data, often ScanTokens are used to > distribute the work, and in the case of writing data perhaps just the table > name is required. > The problem is that each parallel task then also needs to open the table to > request the metadata for the table. Using Spark as an example, this happens > when deserializing the scan tokens in KuduRDD > ([here|https://github.com/apache/kudu/blob/master/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduRDD.scala#L107-L108]) > or when writing rows using the KuduContext > ([here|https://github.com/apache/kudu/blob/master/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala#L466]). > This results in a large burst of metadata requests to the leader Kudu master > all at once. Given the Kudu master is only a single server and requests can't > be served from the follower masters, this effectively limits the amount of > parallel tasks that can run in a large Kudu deployment. Even if the follower > masters could service the requests, that still limits scalability in very > large clusters given most deployments would only have 3-5 masters. 
> Adding a metadata token, similar to a scan token, would be a useful way to > allow the single driver to fetch all the metadata required for the parallel > tasks. The tokens can be serialized and then passed to each task in a similar > fashion to scan tokens. > Of course in a pessimistic case, something may change between generation of > the token and the start of the task. In that case a request would need to be > sent to get the updated metadata. However, that scenario should be rare and > likely would not result in all of the requests happening at the same time. -- This message was sent by Atlassian Jira (v8.3.4#803005)
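The proposed token could look something like the following minimal sketch. All names here (`MetadataToken`, `serialize`, `deserialize`, the fields) are hypothetical, not Kudu API: the driver fetches table metadata once, serializes it, and ships the bytes to each task alongside the work, so tasks do not need to hit the master.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a metadata token: a serializable snapshot of the
// metadata a task needs, generated once on the driver and deserialized on
// each executor instead of re-requested from the master.
class MetadataToken {
    final String tableName;
    final List<String> tabletLocations;

    MetadataToken(String tableName, List<String> tabletLocations) {
        this.tableName = tableName;
        this.tabletLocations = tabletLocations;
    }

    byte[] serialize() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeUTF(tableName);
        out.writeInt(tabletLocations.size());
        for (String loc : tabletLocations) out.writeUTF(loc);
        out.flush();
        return bytes.toByteArray();
    }

    static MetadataToken deserialize(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        String table = in.readUTF();
        int n = in.readInt();
        List<String> locs = new ArrayList<>(n);
        for (int i = 0; i < n; i++) locs.add(in.readUTF());
        return new MetadataToken(table, locs);
    }
}
```

In the stale-metadata case described above, a task would deserialize the token, attempt its work, and fall back to a fresh master lookup only if the cached locations turn out to be wrong.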
[jira] [Assigned] (KUDU-3245) Provide Client API to set verbose logging filtered by vmodule
[ https://issues.apache.org/jira/browse/KUDU-3245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Henke reassigned KUDU-3245: - Assignee: (was: Grant Henke) > Provide Client API to set verbose logging filtered by vmodule > -- > > Key: KUDU-3245 > URL: https://issues.apache.org/jira/browse/KUDU-3245 > Project: Kudu > Issue Type: Improvement > Components: client >Reporter: Hao Hao >Priority: Major > > Similar to the > [{{client::SetVerboseLogLevel}}|https://github.com/apache/kudu/blob/master/src/kudu/client/client.h#L164] > API, it would be nice to add another API to allow enabling verbose logging > filtered by module. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (KUDU-2619) Track Java test failures in the flaky test dashboard
[ https://issues.apache.org/jira/browse/KUDU-2619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Henke reassigned KUDU-2619: - Assignee: (was: Grant Henke) > Track Java test failures in the flaky test dashboard > > > Key: KUDU-2619 > URL: https://issues.apache.org/jira/browse/KUDU-2619 > Project: Kudu > Issue Type: Improvement > Components: java, test >Affects Versions: n/a >Reporter: Adar Dembo >Priority: Major > > Right now our flaky test tracking infrastructure only incorporates C++ tests > using GTest; we should extend it to include Java tests too. > I spent some time on this recently and I wanted to collect my notes in one > place. > For reference, here's how C++ test reporting works: > # The build-and-test.sh script rebuilds thirdparty dependencies, builds Kudu, > and invokes all test suites, optionally using dist-test. After all tests have > been run, it also collects any dist-test artifacts and logs so that all of > the test results are available in one place. > # The run-test.sh script is effectively the C++ "test runner". It is > responsible for running a test binary, retrying it if it fails, and calling > report-test.sh after each test run (success or failure). Importantly, > report-test.sh is invoked once per test binary (not per individual test), and on > test success we don't wait for the script to finish, because we don't care as > much about collecting successes. > # The report-test.sh script collects some basic information about the test > environment (such as the git hash used, whether ASAN or TSAN was enabled, > etc.), then uses curl to send the information to the test result server. > # The test result server will store the test run result in a database, and > will query that database to produce a dashboard. > There are several problems to solve if we're going to replicate this for Java: > # There's no equivalent to run-test.sh. 
The entry point for running the Java > test suite is Gradle, but in dist-test, the test invocation is actually done > via 'java org.junit.runner.JUnitCore'. Note that C++ test reporting is > currently also incompatible with dist-test, so the Java tests aren't unique > in that respect. > # It'd be some work to replace report-test.sh with a Java equivalent. > My thinking is that we should move test reporting from run-test.sh into > build-and-test.sh: > # It's a good separation of concerns. The test "runners" are responsible for > running and maybe retrying tests, while the test "aggregator" > (build-and-test.sh) is responsible for reporting. > # It's more performant. You can imagine building a test_result_server.py > endpoint for reporting en masse, which would cut down on the number of round > trips. That's especially important if we start reporting individual test > results (as opposed to test _suite_ results). > # It means the reporting logic need only be written once. > # It was always a bit unexpected to find reporting logic buried in > run-test.sh. I mean, it made sense for rapid prototyping but it never really > made that much sense to me. > So then the problem is ensuring that, after all tests have run, we have the > right JUnit XML and log files for every test that ran, including retries, > which is more tractable, and doable for dist-test environments too. -- This message was sent by Atlassian Jira (v8.3.4#803005)
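The "aggregator" step proposed above can be sketched briefly. This is an illustration only, not Kudu's actual tooling: after all runners finish, walk the build tree once, collect every JUnit XML report (including retries), and hand the whole set to a single bulk-reporting call instead of one round trip per suite. The `TEST-*.xml` naming convention matches standard JUnit output; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Illustrative sketch of a test-result aggregator: gather every JUnit XML
// report under a build directory in one pass, so results can be reported
// to the server en masse rather than per suite.
class TestResultAggregator {
    static List<Path> collectJunitXml(Path root) throws IOException {
        try (Stream<Path> files = Files.walk(root)) {
            return files
                .filter(p -> p.getFileName().toString().matches("TEST-.*\\.xml"))
                .sorted()
                .collect(Collectors.toList());
        }
    }
}
```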
[jira] [Assigned] (KUDU-2283) Improve KuduPartialRow::ToString() decimal output
[ https://issues.apache.org/jira/browse/KUDU-2283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Henke reassigned KUDU-2283: - Assignee: (was: Grant Henke) > Improve KuduPartialRow::ToString() decimal output > - > > Key: KUDU-2283 > URL: https://issues.apache.org/jira/browse/KUDU-2283 > Project: Kudu > Issue Type: Improvement >Affects Versions: 1.7.0 >Reporter: Grant Henke >Priority: Minor > > Currently KuduPartialRow::ToString() uses "AppendDebugStringForValue" to > print decimal values. However, we could use the ColumnTypeAttributes to better > "pretty print" the decimal values. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
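The "pretty print" idea amounts to placing the decimal point using the column's scale attribute rather than dumping the raw stored integer. A minimal sketch (in Java for consistency with the other examples here; the actual issue concerns the C++ client, and `DecimalFormatter`/`prettyPrint` are hypothetical names): decimal columns store an unscaled integer, and the type attributes say how many fractional digits it carries.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Illustrative sketch: render a stored unscaled decimal value using the
// column's scale attribute, e.g. unscaled 123456 with scale 2 -> "1234.56".
class DecimalFormatter {
    static String prettyPrint(long unscaledValue, int scale) {
        return new BigDecimal(BigInteger.valueOf(unscaledValue), scale).toPlainString();
    }
}
```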
[jira] [Assigned] (KUDU-2375) Can't parse message of type "kudu.master.SysTablesEntryPB" because it is missing required fields: schema.columns[5].type
[ https://issues.apache.org/jira/browse/KUDU-2375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Henke reassigned KUDU-2375: - Assignee: (was: Grant Henke) > Can't parse message of type "kudu.master.SysTablesEntryPB" because it is > missing required fields: schema.columns[5].type > > > Key: KUDU-2375 > URL: https://issues.apache.org/jira/browse/KUDU-2375 > Project: Kudu > Issue Type: Bug > Components: master >Affects Versions: 1.7.0 >Reporter: Michael Brown >Priority: Major > > When tables with decimals are added in 1.7.0, a downgrade from 1.7.0 to 1.6 > results in a dcheck when 1.6 starts and Kudu isn't usable in its downgraded > version. > {noformat} > F0324 17:45:10.681808 105716 catalog_manager.cc:935] Loading table and tablet > metadata into memory failed: Corruption: Failed while visiting tables in sys > catalog: unable to parse metadata field for row > 467d365fffbe4485a3249079c48f42a9: Error parsing msg: Can't parse message of > type "kudu.master.SysTablesEntryPB" because it is missing required fields: > schema.columns[5].type > {noformat} > {noformat} > #0 0x003355e32625 in raise () from /lib64/libc.so.6 > #1 0x003355e33e05 in abort () from /lib64/libc.so.6 > #2 0x01cea129 in ?? () > #3 0x009268cd in google::LogMessage::Fail() () > #4 0x0092878d in google::LogMessage::SendToLog() () > #5 0x00926409 in google::LogMessage::Flush() () > #6 0x0092922f in google::LogMessageFatal::~LogMessageFatal() () > #7 0x008f05de in ?? () > #8 0x008f6039 in > kudu::master::CatalogManager::PrepareForLeadershipTask() () > #9 0x01d297d7 in kudu::ThreadPool::DispatchThread() () > #10 0x01d20151 in kudu::Thread::SuperviseThread(void*) () > #11 0x003356207aa1 in start_thread () from /lib64/libpthread.so.0 > #12 0x003355ee893d in clone () from /lib64/libc.so.6 > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)