[GitHub] [incubator-pinot] kishoreg opened a new pull request #4399: Moving handling of NULL values from RecordReaders to NullValueTransfo…
kishoreg opened a new pull request #4399: Moving handling of NULL values from RecordReaders to NullValueTransfo… URL: https://github.com/apache/incubator-pinot/pull/4399 …rmer Prior to this PR, its the responsibility of each recordReader to populate defaulValues for every field in schema. This PR moves that logic into NullValueTransformer where we set defaultNullValues appropriately. This is a precursor to supporting NULLs in Pinot. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org
[incubator-pinot] branch increase-wait-time updated: Gracefully shutdown helix manager instances created for fake participants
This is an automated email from the ASF dual-hosted git repository. mcvsubbu pushed a commit to branch increase-wait-time in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git The following commit(s) were added to refs/heads/increase-wait-time by this push: new 328875f Gracefully shutdown helix manager instances created for fake participants 328875f is described below commit 328875f731a698f15a524012ca58a0f5d44d43c9 Author: Subbu Subramaniam AuthorDate: Wed Jul 3 15:27:49 2019 -0700 Gracefully shutdown helix manager instances created for fake participants --- .../broker/broker/HelixBrokerStarterTest.java | 8 +- .../helix/ControllerRequestBuilderUtil.java| 105 - ...questBuilderUtil.java => FakeHelixClients.java} | 68 ++--- .../api/resources/PinotFileUploadTest.java | 9 +- .../resources/PinotSegmentRestletResourceTest.java | 9 +- .../resources/PinotTableRestletResourceTest.java | 9 +- .../resources/PinotTenantRestletResourceTest.java | 9 +- .../controller/api/resources/TableViewsTest.java | 9 +- .../helix/ControllerInstanceToggleTest.java| 7 +- .../controller/helix/ControllerSentinelTestV2.java | 7 +- .../controller/helix/ControllerTenantTest.java | 7 +- .../controller/helix/PinotResourceManagerTest.java | 7 +- .../helix/core/PinotHelixResourceManagerTest.java | 9 +- .../ReplicaGroupRebalanceStrategyTest.java | 12 ++- .../sharding/SegmentAssignmentStrategyTest.java| 11 ++- .../validation/ValidationManagerTest.java | 9 +- 16 files changed, 116 insertions(+), 179 deletions(-) diff --git a/pinot-broker/src/test/java/org/apache/pinot/broker/broker/HelixBrokerStarterTest.java b/pinot-broker/src/test/java/org/apache/pinot/broker/broker/HelixBrokerStarterTest.java index 49f5557..958be70 100644 --- a/pinot-broker/src/test/java/org/apache/pinot/broker/broker/HelixBrokerStarterTest.java +++ b/pinot-broker/src/test/java/org/apache/pinot/broker/broker/HelixBrokerStarterTest.java @@ -44,8 +44,8 @@ import org.apache.pinot.common.data.Schema; import org.apache.pinot.common.metadata.segment.OfflineSegmentZKMetadata; import org.apache.pinot.common.utils.CommonConstants; import org.apache.pinot.common.utils.ZkStarter; -import org.apache.pinot.controller.helix.ControllerRequestBuilderUtil; import org.apache.pinot.controller.helix.ControllerTest; +import org.apache.pinot.controller.helix.FakeHelixClients; import org.apache.pinot.controller.utils.SegmentMetadataMockUtils; import org.testng.Assert; import org.testng.annotations.AfterTest; @@ -64,6 +64,7 @@ public class HelixBrokerStarterTest extends ControllerTest { private ZkClient _zkClient; private HelixBrokerStarter _helixBrokerStarter; private ZkStarter.ZookeeperInstance _zookeeperInstance; + private FakeHelixClients _fakeHelixClients; @BeforeTest public void setUp() @@ -78,9 +79,9 @@ public class HelixBrokerStarterTest extends ControllerTest { _helixBrokerStarter = new HelixBrokerStarter(_brokerConf, getHelixClusterName(), ZkStarter.DEFAULT_ZK_STR); _helixBrokerStarter.start(); -ControllerRequestBuilderUtil +_fakeHelixClients .addFakeBrokerInstancesToAutoJoinHelixCluster(getHelixClusterName(), ZkStarter.DEFAULT_ZK_STR, 5, true); -ControllerRequestBuilderUtil +_fakeHelixClients .addFakeDataInstancesToAutoJoinHelixCluster(getHelixClusterName(), ZkStarter.DEFAULT_ZK_STR, 1, true); while (_helixAdmin.getInstancesInClusterWithTag(getHelixClusterName(), "DefaultTenant_OFFLINE").size() == 0 @@ -131,6 +132,7 @@ public class HelixBrokerStarterTest extends ControllerTest { @AfterTest public void tearDown() { +_fakeHelixClients.shutDown(); _helixResourceManager.stop(); _zkClient.close(); ZkStarter.stopLocalZkServer(_zookeeperInstance); diff --git a/pinot-controller/src/main/java/org/apache/pinot/controller/helix/ControllerRequestBuilderUtil.java b/pinot-controller/src/main/java/org/apache/pinot/controller/helix/ControllerRequestBuilderUtil.java index bedc11a..f00f104 100644 --- a/pinot-controller/src/main/java/org/apache/pinot/controller/helix/ControllerRequestBuilderUtil.java +++ b/pinot-controller/src/main/java/org/apache/pinot/controller/helix/ControllerRequestBuilderUtil.java @@ -19,20 +19,8 @@ package org.apache.pinot.controller.helix; import com.fasterxml.jackson.core.JsonProcessingException; -import java.util.HashMap; -import java.util.Map; -import org.apache.helix.HelixManager; -import org.apache.helix.HelixManagerFactory; -import org.apache.helix.InstanceType; -import org.apache.helix.model.HelixConfigScope; -import org.apache.helix.model.builder.HelixConfigScopeBuilder; -import org.apache.helix.participant.StateMachineEngine; -import org.apache.helix.participant.statemachine.StateModelFactory; -import org.apache.pinot.common.config.TableNameBuilder; -import org.apache.pinot.common.config.Tag
[GitHub] [incubator-pinot] kishoreg merged pull request #4398: Add more key/value pairs into LOG2M_TO_SIZE_IN_BYTES in HllSizeUtils
kishoreg merged pull request #4398: Add more key/value pairs into LOG2M_TO_SIZE_IN_BYTES in HllSizeUtils URL: https://github.com/apache/incubator-pinot/pull/4398 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org
[incubator-pinot] branch master updated: Add more key/value pairs into LOG2M_TO_SIZE_IN_BYTES in HllSizeUtils (#4398)
This is an automated email from the ASF dual-hosted git repository. kishoreg pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git The following commit(s) were added to refs/heads/master by this push: new 732a7b9 Add more key/value pairs into LOG2M_TO_SIZE_IN_BYTES in HllSizeUtils (#4398) 732a7b9 is described below commit 732a7b9228cb602f685909460e44a4ef76e77272 Author: Bo Zhang <44179472+bozhang2...@users.noreply.github.com> AuthorDate: Wed Jul 3 15:13:39 2019 -0700 Add more key/value pairs into LOG2M_TO_SIZE_IN_BYTES in HllSizeUtils (#4398) --- .../java/org/apache/pinot/startree/hll/HllSizeUtils.java | 14 +- .../apache/pinot/core/startree/hll/HllFieldSizeTest.java | 2 +- 2 files changed, 14 insertions(+), 2 deletions(-) diff --git a/pinot-common/src/main/java/org/apache/pinot/startree/hll/HllSizeUtils.java b/pinot-common/src/main/java/org/apache/pinot/startree/hll/HllSizeUtils.java index 065948b..9d0ab4b 100644 --- a/pinot-common/src/main/java/org/apache/pinot/startree/hll/HllSizeUtils.java +++ b/pinot-common/src/main/java/org/apache/pinot/startree/hll/HllSizeUtils.java @@ -28,7 +28,19 @@ import com.google.common.collect.ImmutableBiMap; public class HllSizeUtils { private static final ImmutableBiMap LOG2M_TO_SIZE_IN_BYTES = - ImmutableBiMap.of(5, 32, 6, 52, 7, 96, 8, 180, 9, 352); + ImmutableBiMap.builder() + .put(5, 32) + .put(6, 52) + .put(7, 96) + .put(8, 180) + .put(9, 352) + .put(10, 692) + .put(11, 1376) + .put(12, 2740) + .put(13, 5472) + .put(14, 10932) + .put(15, 21856) + .build(); public static ImmutableBiMap getLog2mToSizeInBytes() { return LOG2M_TO_SIZE_IN_BYTES; diff --git a/pinot-core/src/test/java/org/apache/pinot/core/startree/hll/HllFieldSizeTest.java b/pinot-core/src/test/java/org/apache/pinot/core/startree/hll/HllFieldSizeTest.java index c2b5a6e..6bfcef8 100644 --- a/pinot-core/src/test/java/org/apache/pinot/core/startree/hll/HllFieldSizeTest.java +++ b/pinot-core/src/test/java/org/apache/pinot/core/startree/hll/HllFieldSizeTest.java @@ -34,7 +34,7 @@ public class HllFieldSizeTest { @Test public void testHllFieldSerializedSize() throws Exception { -for (int i = 5; i < 10; i++) { +for (int i = 5; i < 16; i++) { HyperLogLog hll = new HyperLogLog(i); Assert.assertEquals(HllSizeUtils.getHllFieldSizeFromLog2m(i), hll.getBytes().length); LOGGER.info("Estimated: " + hll.cardinality()); - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org
[GitHub] [incubator-pinot] codecov-io commented on issue #4398: Add more key/value pairs into LOG2M_TO_SIZE_IN_BYTES in HllSizeUtils
codecov-io commented on issue #4398: Add more key/value pairs into LOG2M_TO_SIZE_IN_BYTES in HllSizeUtils URL: https://github.com/apache/incubator-pinot/pull/4398#issuecomment-508272173 # [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/4398?src=pr&el=h1) Report > Merging [#4398](https://codecov.io/gh/apache/incubator-pinot/pull/4398?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-pinot/commit/33f583b2a84a8a22143a26c9cc6b80dc2a81563c?src=pr&el=desc) will **increase** coverage by `9.59%`. > The diff coverage is `100%`. [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-pinot/pull/4398/graphs/tree.svg?width=650&token=4ibza2ugkz&height=150&src=pr)](https://codecov.io/gh/apache/incubator-pinot/pull/4398?src=pr&el=tree) ```diff @@ Coverage Diff @@ ## master #4398 +/- ## === + Coverage 55.7% 65.3% +9.59% Complexity 20 20 === Files 10661066 Lines 55403 55415 +12 Branches 78947894 === + Hits 30865 36189+5324 + Misses22150 16656-5494 - Partials 23882570 +182 ``` | [Impacted Files](https://codecov.io/gh/apache/incubator-pinot/pull/4398?src=pr&el=tree) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...va/org/apache/pinot/startree/hll/HllSizeUtils.java](https://codecov.io/gh/apache/incubator-pinot/pull/4398/diff?src=pr&el=tree#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zdGFydHJlZS9obGwvSGxsU2l6ZVV0aWxzLmphdmE=) | `80.95% <100%> (+25.39%)` | `0 <0> (ø)` | :arrow_down: | | [...pache/pinot/core/util/SortedRangeIntersection.java](https://codecov.io/gh/apache/incubator-pinot/pull/4398/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS91dGlsL1NvcnRlZFJhbmdlSW50ZXJzZWN0aW9uLmphdmE=) | `83.82% <0%> (-7.36%)` | `0% <0%> (ø)` | | | [...regation/function/customobject/QuantileDigest.java](https://codecov.io/gh/apache/incubator-pinot/pull/4398/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9xdWVyeS9hZ2dyZWdhdGlvbi9mdW5jdGlvbi9jdXN0b21vYmplY3QvUXVhbnRpbGVEaWdlc3QuamF2YQ==) | `57.74% <0%> (-0.45%)` | `0% <0%> (ø)` | | | [...ator/transform/function/BaseTransformFunction.java](https://codecov.io/gh/apache/incubator-pinot/pull/4398/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci90cmFuc2Zvcm0vZnVuY3Rpb24vQmFzZVRyYW5zZm9ybUZ1bmN0aW9uLmphdmE=) | `29.95% <0%> (+0.42%)` | `0% <0%> (ø)` | :arrow_down: | | [...g/apache/pinot/common/utils/helix/HelixHelper.java](https://codecov.io/gh/apache/incubator-pinot/pull/4398/diff?src=pr&el=tree#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vdXRpbHMvaGVsaXgvSGVsaXhIZWxwZXIuamF2YQ==) | `56.25% <0%> (+0.56%)` | `0% <0%> (ø)` | :arrow_down: | | [...ment/creator/impl/SegmentColumnarIndexCreator.java](https://codecov.io/gh/apache/incubator-pinot/pull/4398/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zZWdtZW50L2NyZWF0b3IvaW1wbC9TZWdtZW50Q29sdW1uYXJJbmRleENyZWF0b3IuamF2YQ==) | `87.45% <0%> (+0.76%)` | `0% <0%> (ø)` | :arrow_down: | | [...r/transform/function/ValueInTransformFunction.java](https://codecov.io/gh/apache/incubator-pinot/pull/4398/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci90cmFuc2Zvcm0vZnVuY3Rpb24vVmFsdWVJblRyYW5zZm9ybUZ1bmN0aW9uLmphdmE=) | `39.2% <0%> (+0.8%)` | `0% <0%> (ø)` | :arrow_down: | | [.../helix/core/realtime/SegmentCompletionManager.java](https://codecov.io/gh/apache/incubator-pinot/pull/4398/diff?src=pr&el=tree#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci9oZWxpeC9jb3JlL3JlYWx0aW1lL1NlZ21lbnRDb21wbGV0aW9uTWFuYWdlci5qYXZh) | `70.39% <0%> (+0.87%)` | `0% <0%> (ø)` | :arrow_down: | | [...e/io/writer/impl/MutableOffHeapByteArrayStore.java](https://codecov.io/gh/apache/incubator-pinot/pull/4398/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9pby93cml0ZXIvaW1wbC9NdXRhYmxlT2ZmSGVhcEJ5dGVBcnJheVN0b3JlLmphdmE=) | `86.45% <0%> (+1.04%)` | `0% <0%> (ø)` | :arrow_down: | | [.../pinot/core/segment/index/SegmentMetadataImpl.java](https://codecov.io/gh/apache/incubator-pinot/pull/4398/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zZWdtZW50L2luZGV4L1NlZ21lbnRNZXRhZGF0YUltcGwuamF2YQ==) | `80.81% <0%> (+1.22%)` | `0% <0%> (ø)` | :arrow_down: | | ... and [321 more](https://codecov.io/gh/apache/incubator-pinot/pull/4398/diff?src=pr&el=tree-more) | | -- [Continue to revie
[GitHub] [incubator-pinot] bozhang2820 commented on issue #4398: Add more key/value pairs into LOG2M_TO_SIZE_IN_BYTES in HllSizeUtils
bozhang2820 commented on issue #4398: Add more key/value pairs into LOG2M_TO_SIZE_IN_BYTES in HllSizeUtils URL: https://github.com/apache/incubator-pinot/pull/4398#issuecomment-508261303 > This is ok but what are we gaining by this change. Is the accuracy changing dramatically Yes without this change the best standard error (1.04 / sqrt(m)) is at ~4.6% when log2m = 9. With this change it goes down to ~0.57% (when log2m = 15). In fact at Uber we have a table configured with log2m = 13 and we have a similar patch to accommodate that. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org
[GitHub] [incubator-pinot] bozhang2820 opened a new pull request #4398: Add more key/value pairs into LOG2M_TO_SIZE_IN_BYTES in HllSizeUtils
bozhang2820 opened a new pull request #4398: Add more key/value pairs into LOG2M_TO_SIZE_IN_BYTES in HllSizeUtils URL: https://github.com/apache/incubator-pinot/pull/4398 This change would allow more options for log2m in HLL config. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org
[GitHub] [incubator-pinot] kishoreg commented on a change in pull request #4321: #4317 Feature/variable length bytes offline dictionary for indexing bytes and string dicts.
kishoreg commented on a change in pull request #4321: #4317 Feature/variable length bytes offline dictionary for indexing bytes and string dicts. URL: https://github.com/apache/incubator-pinot/pull/4321#discussion_r300060734 ## File path: pinot-core/src/main/java/org/apache/pinot/core/realtime/converter/RealtimeSegmentConverter.java ## @@ -54,10 +54,12 @@ private List invertedIndexColumns; private List noDictionaryColumns; private StarTreeIndexSpec starTreeIndexSpec; + private List varLengthDictionaryColumns; public RealtimeSegmentConverter(MutableSegmentImpl realtimeSegment, String outputPath, Schema schema, String tableName, String timeColumnName, String segmentName, String sortedColumn, - List invertedIndexColumns, List noDictionaryColumns, StarTreeIndexSpec starTreeIndexSpec) { + List invertedIndexColumns, List noDictionaryColumns, + List varLengthDictionaryColumns, StarTreeIndexSpec starTreeIndexSpec) { Review comment: This is outside the scope of this PR. Dont worry about it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org
[GitHub] [incubator-pinot] buchireddy commented on a change in pull request #4321: #4317 Feature/variable length bytes offline dictionary for indexing bytes and string dicts.
buchireddy commented on a change in pull request #4321: #4317 Feature/variable length bytes offline dictionary for indexing bytes and string dicts. URL: https://github.com/apache/incubator-pinot/pull/4321#discussion_r300053802 ## File path: pinot-core/src/main/java/org/apache/pinot/core/io/util/VarLengthBytesValueReaderWriter.java ## @@ -0,0 +1,196 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ +package org.apache.pinot.core.io.util; + +import com.google.common.base.Preconditions; +import java.io.Closeable; +import java.io.IOException; +import java.util.Arrays; +import org.apache.pinot.common.utils.StringUtil; +import org.apache.pinot.core.segment.memory.PinotDataBuffer; + +/** + * Implementation of {@link ValueReader} that will allow each value to be of variable + * length and there by avoiding the unnecessary padding. + * + * The layout of the file is as follows: + * Header Section: + * + *Magic header byte sequence. + * + * + * Values: + * + *Integer offsets to start position of byte arrays. + *All byte arrays. + * + * + * Only sequential writes are supported. + * + * @see FixedByteValueReaderWriter + */ +public class VarLengthBytesValueReaderWriter implements Closeable, ValueReader { + + /** + * Header used to identify the dictionary files written in variable length bytes + * format. Keeping this a mix of alphanumeric with special character to avoid + * collisions with the regular int/string dictionary values written in fixed size + * format. + */ + private static final byte[] MAGIC_HEADER = StringUtil.encodeUtf8("vl1;"); + private static final int MAGIC_HEADER_LENGTH = MAGIC_HEADER.length; + private static final int INDEX_ARRAY_START_OFFSET = MAGIC_HEADER.length + Integer.BYTES; + + private final PinotDataBuffer _dataBuffer; + private transient int _numElements; + + public VarLengthBytesValueReaderWriter(PinotDataBuffer dataBuffer) { +this._dataBuffer = dataBuffer; + } + + public static long getRequiredSize(byte[][] byteArrays) { +// First include the magic header, length field and offsets array. +long length = INDEX_ARRAY_START_OFFSET + Integer.BYTES * (byteArrays.length); + +for (byte[] array: byteArrays) { + length += array.length; +} +return length; + } + + public static boolean isVarLengthBytesDictBuffer(PinotDataBuffer buffer) { +// If the buffer is smaller than header + numElements size, it's not var length dictionary. +if (buffer.size() > INDEX_ARRAY_START_OFFSET) { + byte[] header = new byte[MAGIC_HEADER_LENGTH]; + buffer.copyTo(0, header, 0, MAGIC_HEADER_LENGTH); + + if (Arrays.equals(MAGIC_HEADER, header)) { +// Also verify that there is a valid numElements value. +int numElements = buffer.getInt(MAGIC_HEADER_LENGTH); +return numElements > 0; + } +} + +return false; + } + + private void writeHeader() { +for (int offset = 0; offset < MAGIC_HEADER_LENGTH; offset++) { + _dataBuffer.putByte(offset, MAGIC_HEADER[offset]); +} + } + + public void init(byte[][] byteArrays) { +this._numElements = byteArrays.length; + +writeHeader(); + +// Add the number of elements as the first field in the buffer. +_dataBuffer.putInt(MAGIC_HEADER_LENGTH, byteArrays.length); + +// Then write the offset of each of the byte array in the data buffer. +int nextOffset = INDEX_ARRAY_START_OFFSET; +int nextArrayStartOffset = INDEX_ARRAY_START_OFFSET + Integer.BYTES * byteArrays.length; +for (byte[] array: byteArrays) { + _dataBuffer.putInt(nextOffset, nextArrayStartOffset); + nextOffset += Integer.BYTES; + nextArrayStartOffset += array.length; +} + +// Finally write the byte arrays. +nextArrayStartOffset = INDEX_ARRAY_START_OFFSET + Integer.BYTES * byteArrays.length; +for (byte[] array: byteArrays) { + _dataBuffer.readFrom(nextArrayStartOffset, array); + nextArrayStartOffset += array.length; +} + } + + int getNumElements() { +// Lazily initialize the numElements. +if (_numElements == 0) { + _numElements = _dataBuffer.getInt(MAGIC_HEADER_LENGTH); + Preconditions.checkArgumen
[GitHub] [incubator-pinot] buchireddy commented on a change in pull request #4321: #4317 Feature/variable length bytes offline dictionary for indexing bytes and string dicts.
buchireddy commented on a change in pull request #4321: #4317 Feature/variable length bytes offline dictionary for indexing bytes and string dicts. URL: https://github.com/apache/incubator-pinot/pull/4321#discussion_r300053298 ## File path: pinot-core/src/main/java/org/apache/pinot/core/realtime/converter/RealtimeSegmentConverter.java ## @@ -54,10 +54,12 @@ private List invertedIndexColumns; private List noDictionaryColumns; private StarTreeIndexSpec starTreeIndexSpec; + private List varLengthDictionaryColumns; public RealtimeSegmentConverter(MutableSegmentImpl realtimeSegment, String outputPath, Schema schema, String tableName, String timeColumnName, String segmentName, String sortedColumn, - List invertedIndexColumns, List noDictionaryColumns, StarTreeIndexSpec starTreeIndexSpec) { + List invertedIndexColumns, List noDictionaryColumns, + List varLengthDictionaryColumns, StarTreeIndexSpec starTreeIndexSpec) { Review comment: I've not spent enough time on this part of code to give my take on what's the best thing to do. If one of you can please log an issue with more details, I'll take it up next, check the code and can make the changes finally. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org
[GitHub] [incubator-pinot] codecov-io commented on issue #4388: Adding support for Map type fields
codecov-io commented on issue #4388: Adding support for Map type fields URL: https://github.com/apache/incubator-pinot/pull/4388#issuecomment-508156824 # [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/4388?src=pr&el=h1) Report > Merging [#4388](https://codecov.io/gh/apache/incubator-pinot/pull/4388?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-pinot/commit/33f583b2a84a8a22143a26c9cc6b80dc2a81563c?src=pr&el=desc) will **increase** coverage by `9.75%`. > The diff coverage is `89.79%`. [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-pinot/pull/4388/graphs/tree.svg?width=650&token=4ibza2ugkz&height=150&src=pr)](https://codecov.io/gh/apache/incubator-pinot/pull/4388?src=pr&el=tree) ```diff @@ Coverage Diff @@ ## master#4388 +/- ## + Coverage 55.7% 65.46% +9.75% Complexity 20 20 Files 1066 1067 +1 Lines 5540355448 +45 Branches 7894 7902 +8 + Hits 3086536300+5435 + Misses2215016570-5580 - Partials 2388 2578 +190 ``` | [Impacted Files](https://codecov.io/gh/apache/incubator-pinot/pull/4388?src=pr&el=tree) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...segment/creator/impl/SegmentDictionaryCreator.java](https://codecov.io/gh/apache/incubator-pinot/pull/4388/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zZWdtZW50L2NyZWF0b3IvaW1wbC9TZWdtZW50RGljdGlvbmFyeUNyZWF0b3IuamF2YQ==) | `87.75% <ø> (+3.97%)` | `0 <0> (ø)` | :arrow_down: | | [...realtime/stream/AvroRecordToPinotRowGenerator.java](https://codecov.io/gh/apache/incubator-pinot/pull/4388/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9yZWFsdGltZS9zdHJlYW0vQXZyb1JlY29yZFRvUGlub3RSb3dHZW5lcmF0b3IuamF2YQ==) | `100% <100%> (+7.69%)` | `0 <0> (ø)` | :arrow_down: | | [...r/transform/function/TransformFunctionFactory.java](https://codecov.io/gh/apache/incubator-pinot/pull/4388/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci90cmFuc2Zvcm0vZnVuY3Rpb24vVHJhbnNmb3JtRnVuY3Rpb25GYWN0b3J5LmphdmE=) | `69.09% <100%> (+6.12%)` | `0 <0> (ø)` | :arrow_down: | | [...ache/pinot/core/data/readers/AvroRecordReader.java](https://codecov.io/gh/apache/incubator-pinot/pull/4388/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9kYXRhL3JlYWRlcnMvQXZyb1JlY29yZFJlYWRlci5qYXZh) | `86.2% <100%> (ø)` | `0 <0> (ø)` | :arrow_down: | | [.../transform/function/MapValueTransformFunction.java](https://codecov.io/gh/apache/incubator-pinot/pull/4388/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci90cmFuc2Zvcm0vZnVuY3Rpb24vTWFwVmFsdWVUcmFuc2Zvcm1GdW5jdGlvbi5qYXZh) | `86.95% <86.95%> (ø)` | `0 <0> (?)` | | | [...ain/java/org/apache/pinot/core/util/AvroUtils.java](https://codecov.io/gh/apache/incubator-pinot/pull/4388/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS91dGlsL0F2cm9VdGlscy5qYXZh) | `53.33% <91.3%> (+7.79%)` | `0 <0> (ø)` | :arrow_down: | | [...ator/transform/function/BaseTransformFunction.java](https://codecov.io/gh/apache/incubator-pinot/pull/4388/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci90cmFuc2Zvcm0vZnVuY3Rpb24vQmFzZVRyYW5zZm9ybUZ1bmN0aW9uLmphdmE=) | `29.95% <0%> (+0.42%)` | `0% <0%> (ø)` | :arrow_down: | | [...ment/creator/impl/SegmentColumnarIndexCreator.java](https://codecov.io/gh/apache/incubator-pinot/pull/4388/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zZWdtZW50L2NyZWF0b3IvaW1wbC9TZWdtZW50Q29sdW1uYXJJbmRleENyZWF0b3IuamF2YQ==) | `87.45% <0%> (+0.76%)` | `0% <0%> (ø)` | :arrow_down: | | [...r/transform/function/ValueInTransformFunction.java](https://codecov.io/gh/apache/incubator-pinot/pull/4388/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci90cmFuc2Zvcm0vZnVuY3Rpb24vVmFsdWVJblRyYW5zZm9ybUZ1bmN0aW9uLmphdmE=) | `39.2% <0%> (+0.8%)` | `0% <0%> (ø)` | :arrow_down: | | ... and [323 more](https://codecov.io/gh/apache/incubator-pinot/pull/4388/diff?src=pr&el=tree-more) | | -- [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/4388?src=pr&el=continue). > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta) > `Δ = absolute (impact)`, `ø = not affected`, `? = missing data` > Powered by [Codecov](https://codecov.io/gh/apache/incubator-pin
[GitHub] [incubator-pinot] kishoreg commented on a change in pull request #4321: #4317 Feature/variable length bytes offline dictionary for indexing bytes and string dicts.
kishoreg commented on a change in pull request #4321: #4317 Feature/variable length bytes offline dictionary for indexing bytes and string dicts. URL: https://github.com/apache/incubator-pinot/pull/4321#discussion_r300023548 ## File path: pinot-core/src/main/java/org/apache/pinot/core/realtime/converter/RealtimeSegmentConverter.java ## @@ -54,10 +54,12 @@ private List invertedIndexColumns; private List noDictionaryColumns; private StarTreeIndexSpec starTreeIndexSpec; + private List varLengthDictionaryColumns; public RealtimeSegmentConverter(MutableSegmentImpl realtimeSegment, String outputPath, Schema schema, String tableName, String timeColumnName, String segmentName, String sortedColumn, - List invertedIndexColumns, List noDictionaryColumns, StarTreeIndexSpec starTreeIndexSpec) { + List invertedIndexColumns, List noDictionaryColumns, + List varLengthDictionaryColumns, StarTreeIndexSpec starTreeIndexSpec) { Review comment: Instead, we should deprecate segmentGenerationConfig and use TableConfig everywhere. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org
[GitHub] [incubator-pinot] codecov-io commented on issue #4397: Adding Support for Kafka 2.0 Consumer
codecov-io commented on issue #4397: Adding Support for Kafka 2.0 Consumer URL: https://github.com/apache/incubator-pinot/pull/4397#issuecomment-508060135 # [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/4397?src=pr&el=h1) Report > Merging [#4397](https://codecov.io/gh/apache/incubator-pinot/pull/4397?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-pinot/commit/33f583b2a84a8a22143a26c9cc6b80dc2a81563c?src=pr&el=desc) will **increase** coverage by `9.66%`. > The diff coverage is `n/a`. [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-pinot/pull/4397/graphs/tree.svg?width=650&token=4ibza2ugkz&height=150&src=pr)](https://codecov.io/gh/apache/incubator-pinot/pull/4397?src=pr&el=tree) ```diff @@ Coverage Diff @@ ## master#4397 +/- ## + Coverage 55.7% 65.37% +9.66% Complexity 20 20 Files 1066 1066 Lines 5540355403 Branches 7894 7894 + Hits 3086536217+5352 + Misses2215016626-5524 - Partials 2388 2560 +172 ``` | [Impacted Files](https://codecov.io/gh/apache/incubator-pinot/pull/4397?src=pr&el=tree) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...he/pinot/controller/util/SegmentIntervalUtils.java](https://codecov.io/gh/apache/incubator-pinot/pull/4397/diff?src=pr&el=tree#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci91dGlsL1NlZ21lbnRJbnRlcnZhbFV0aWxzLmphdmE=) | `0% <0%> (ø)` | `0% <0%> (ø)` | :arrow_down: | | [...ator/transform/function/BaseTransformFunction.java](https://codecov.io/gh/apache/incubator-pinot/pull/4397/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci90cmFuc2Zvcm0vZnVuY3Rpb24vQmFzZVRyYW5zZm9ybUZ1bmN0aW9uLmphdmE=) | `29.95% <0%> (+0.42%)` | `0% <0%> (ø)` | :arrow_down: | | [...g/apache/pinot/common/utils/helix/HelixHelper.java](https://codecov.io/gh/apache/incubator-pinot/pull/4397/diff?src=pr&el=tree#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vdXRpbHMvaGVsaXgvSGVsaXhIZWxwZXIuamF2YQ==) | `56.25% <0%> (+0.56%)` | `0% <0%> (ø)` | :arrow_down: | | [...ment/creator/impl/SegmentColumnarIndexCreator.java](https://codecov.io/gh/apache/incubator-pinot/pull/4397/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zZWdtZW50L2NyZWF0b3IvaW1wbC9TZWdtZW50Q29sdW1uYXJJbmRleENyZWF0b3IuamF2YQ==) | `87.45% <0%> (+0.76%)` | `0% <0%> (ø)` | :arrow_down: | | [...r/transform/function/ValueInTransformFunction.java](https://codecov.io/gh/apache/incubator-pinot/pull/4397/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci90cmFuc2Zvcm0vZnVuY3Rpb24vVmFsdWVJblRyYW5zZm9ybUZ1bmN0aW9uLmphdmE=) | `39.2% <0%> (+0.8%)` | `0% <0%> (ø)` | :arrow_down: | | [.../helix/core/realtime/SegmentCompletionManager.java](https://codecov.io/gh/apache/incubator-pinot/pull/4397/diff?src=pr&el=tree#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci9oZWxpeC9jb3JlL3JlYWx0aW1lL1NlZ21lbnRDb21wbGV0aW9uTWFuYWdlci5qYXZh) | `70.39% <0%> (+0.87%)` | `0% <0%> (ø)` | :arrow_down: | | [...e/io/writer/impl/MutableOffHeapByteArrayStore.java](https://codecov.io/gh/apache/incubator-pinot/pull/4397/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9pby93cml0ZXIvaW1wbC9NdXRhYmxlT2ZmSGVhcEJ5dGVBcnJheVN0b3JlLmphdmE=) | `86.45% <0%> (+1.04%)` | `0% <0%> (ø)` | :arrow_down: | | [.../pinot/core/segment/index/SegmentMetadataImpl.java](https://codecov.io/gh/apache/incubator-pinot/pull/4397/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zZWdtZW50L2luZGV4L1NlZ21lbnRNZXRhZGF0YUltcGwuamF2YQ==) | `80.81% <0%> (+1.22%)` | `0% <0%> (ø)` | :arrow_down: | | [...va/org/apache/pinot/common/data/TimeFieldSpec.java](https://codecov.io/gh/apache/incubator-pinot/pull/4397/diff?src=pr&el=tree#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vZGF0YS9UaW1lRmllbGRTcGVjLmphdmE=) | `93.82% <0%> (+1.23%)` | `0% <0%> (ø)` | :arrow_down: | | [.../org/apache/pinot/pql/parsers/Pql2AstListener.java](https://codecov.io/gh/apache/incubator-pinot/pull/4397/diff?src=pr&el=tree#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9wcWwvcGFyc2Vycy9QcWwyQXN0TGlzdGVuZXIuamF2YQ==) | `89.3% <0%> (+1.25%)` | `0% <0%> (ø)` | :arrow_down: | | ... and [320 more](https://codecov.io/gh/apache/incubator-pinot/pull/4397/diff?src=pr&el=tree-more) | | -- [Continue to review full report at Codecov](https:
[GitHub] [incubator-pinot] fx19880617 opened a new pull request #4397: Adding Support for Kafka 2.0 Consumer
fx19880617 opened a new pull request #4397: Adding Support for Kafka 2.0 Consumer URL: https://github.com/apache/incubator-pinot/pull/4397 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org
[incubator-pinot] 01/02: WIP: adding kafka 2 stream provider
This is an automated email from the ASF dual-hosted git repository. xiangfu pushed a commit to branch kafka_2.0 in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git commit 3f682d0c454263a8ab5361d751f9f5c9b30e715a Author: Ananth Packkildurai AuthorDate: Wed Jun 12 18:31:22 2019 -0700 WIP: adding kafka 2 stream provider --- .../pinot-connector-kafka-2.0/README.md| 24 pinot-connectors/pinot-connector-kafka-2.0/pom.xml | 67 ++ .../impl/kafka2/KafkaConnectionHandler.java| 61 + .../realtime/impl/kafka2/KafkaConsumerFactory.java | 49 +++ .../realtime/impl/kafka2/KafkaMessageBatch.java| 65 ++ .../impl/kafka2/KafkaPartitionConsumer.java| 51 .../kafka2/KafkaPartitionLevelStreamConfig.java| 144 + .../impl/kafka2/KafkaStreamConfigProperties.java | 65 ++ .../impl/kafka2/KafkaStreamMetadataProvider.java | 81 .../realtime/impl/kafka2/MessageAndOffset.java | 49 +++ pinot-connectors/pom.xml | 2 + 11 files changed, 658 insertions(+) diff --git a/pinot-connectors/pinot-connector-kafka-2.0/README.md b/pinot-connectors/pinot-connector-kafka-2.0/README.md new file mode 100644 index 000..cc1950c --- /dev/null +++ b/pinot-connectors/pinot-connector-kafka-2.0/README.md @@ -0,0 +1,24 @@ + +# Pinot connector for kafka 2.0.x + +This is an implementation of the kafka stream for kafka versions 2.0.x The version used in this implementation is kafka 2.0.0. This module compiles with version 2.0.0 as well, however we have not tested if it runs with the older versions. +A stream plugin for another version of kafka, or another stream, can be added in a similar fashion. Refer to documentation on (Pluggable Streams)[https://pinot.readthedocs.io/en/latest/pluggable_streams.html] for the specfic interfaces to implement. diff --git a/pinot-connectors/pinot-connector-kafka-2.0/pom.xml b/pinot-connectors/pinot-connector-kafka-2.0/pom.xml new file mode 100644 index 000..f351219 --- /dev/null +++ b/pinot-connectors/pinot-connector-kafka-2.0/pom.xml @@ -0,0 +1,67 @@ + + +http://maven.apache.org/POM/4.0.0"; + xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"; + xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd";> + +pinot-connectors +org.apache.pinot +0.2.0-SNAPSHOT +.. + +4.0.0 + +pinot-connector-kafka-2.0 + + +${basedir}/../.. +2.0.0 + + + + + + +org.apache.kafka +kafka-clients +${kafka.version} + + +org.slf4j +slf4j-log4j12 + + +net.sf.jopt-simple +jopt-simple + + +org.scala-lang +scala-library + + +org.apache.zookeeper +zookeeper + + + + + \ No newline at end of file diff --git a/pinot-connectors/pinot-connector-kafka-2.0/src/main/java/org/apache/pinot/core/realtime/impl/kafka2/KafkaConnectionHandler.java b/pinot-connectors/pinot-connector-kafka-2.0/src/main/java/org/apache/pinot/core/realtime/impl/kafka2/KafkaConnectionHandler.java new file mode 100644 index 000..802062f --- /dev/null +++ b/pinot-connectors/pinot-connector-kafka-2.0/src/main/java/org/apache/pinot/core/realtime/impl/kafka2/KafkaConnectionHandler.java @@ -0,0 +1,61 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ +package org.apache.pinot.core.realtime.impl.kafka2; + +import org.apache.kafka.clients.consumer.Consumer; +import org.apache.kafka.clients.consumer.ConsumerConfig; +import org.apache.kafka.clients.consumer.KafkaConsumer; +import org.apache.kafka.common.TopicPartition; +import org.apache.kafka.common.serialization.BytesDeserializer; +import org.apache.kafka.common.serialization.StringDeserializer; +import org.apache.pinot.core.realtime.stream.StreamConfig; + +import java.
[incubator-pinot] branch kafka_2.0 created (now d9e0316)
This is an automated email from the ASF dual-hosted git repository. xiangfu pushed a change to branch kafka_2.0 in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git. at d9e0316 Adding support for Kafka 2.0 This branch includes the following new commits: new 3f682d0 WIP: adding kafka 2 stream provider new d9e0316 Adding support for Kafka 2.0 The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org
[incubator-pinot] 02/02: Adding support for Kafka 2.0
This is an automated email from the ASF dual-hosted git repository. xiangfu pushed a commit to branch kafka_2.0 in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git commit d9e031618c4c7fa28e64d62858aa7e4a36d6f279 Author: Xiang Fu AuthorDate: Mon Jul 1 16:17:25 2019 -0700 Adding support for Kafka 2.0 --- pinot-connectors/pinot-connector-kafka-0.9/pom.xml | 5 + pinot-connectors/pinot-connector-kafka-2.0/pom.xml | 112 +--- ...umerFactory.java => Kafka2ConsumerFactory.java} | 36 +-- .../impl/kafka2/Kafka2ConsumerManager.java | 191 ++ .../impl/kafka2/Kafka2HighLevelStreamConfig.java | 135 ++ .../realtime/impl/kafka2/Kafka2MessageBatch.java | 61 + .../Kafka2PartitionLevelConnectionHandler.java | 67 + ...Kafka2PartitionLevelPartitionLevelConsumer.java | 65 + .../kafka2/Kafka2PartitionLevelStreamConfig.java | 146 +++ ...Kafka2PartitionLevelStreamMetadataProvider.java | 67 + ...ties.java => Kafka2StreamConfigProperties.java} | 32 +-- .../impl/kafka2/Kafka2StreamLevelConsumer.java | 166 .../impl/kafka2/KafkaAvroMessageDecoder.java | 290 + .../impl/kafka2/KafkaConnectionHandler.java| 61 - .../impl/kafka2/KafkaJSONMessageDecoder.java | 63 + .../realtime/impl/kafka2/KafkaMessageBatch.java| 65 - .../impl/kafka2/KafkaPartitionConsumer.java| 51 .../kafka2/KafkaPartitionLevelStreamConfig.java| 144 -- .../impl/kafka2/KafkaStreamMetadataProvider.java | 81 -- .../realtime/impl/kafka2/MessageAndOffset.java | 42 +-- .../kafka2/KafkaPartitionLevelConsumerTest.java| 232 + .../KafkaPartitionLevelStreamConfigTest.java | 161 .../impl/kafka2/utils/EmbeddedZooKeeper.java | 60 + .../impl/kafka2/utils/MiniKafkaCluster.java| 175 + pinot-connectors/pom.xml | 12 + 25 files changed, 2024 insertions(+), 496 deletions(-) diff --git a/pinot-connectors/pinot-connector-kafka-0.9/pom.xml b/pinot-connectors/pinot-connector-kafka-0.9/pom.xml index ae0317e..852c29c 100644 --- a/pinot-connectors/pinot-connector-kafka-0.9/pom.xml +++ b/pinot-connectors/pinot-connector-kafka-0.9/pom.xml @@ -63,5 +63,10 @@ + + org.scala-lang + scala-library + 2.10.5 + diff --git a/pinot-connectors/pinot-connector-kafka-2.0/pom.xml b/pinot-connectors/pinot-connector-kafka-2.0/pom.xml index f351219..2a9c155 100644 --- a/pinot-connectors/pinot-connector-kafka-2.0/pom.xml +++ b/pinot-connectors/pinot-connector-kafka-2.0/pom.xml @@ -22,46 +22,82 @@ http://maven.apache.org/POM/4.0.0"; xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"; xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd";> - -pinot-connectors -org.apache.pinot -0.2.0-SNAPSHOT -.. - -4.0.0 + +pinot-connectors +org.apache.pinot +0.2.0-SNAPSHOT +.. + + 4.0.0 + pinot-connector-kafka-2.0 + Pinot Connector Kafka 2.0 + https://pinot.apache.org/ + +${basedir}/../.. +2.0.0 + -pinot-connector-kafka-2.0 + - -${basedir}/../.. -2.0.0 - + + + org.apache.kafka + kafka-clients + ${kafka.version} + + + org.slf4j + slf4j-log4j12 + + + net.sf.jopt-simple + jopt-simple + + + org.scala-lang + scala-library + + + org.apache.zookeeper + zookeeper + + + - + + org.apache.kafka + kafka_2.12 + ${kafka.version} + + + org.slf4j + slf4j-log4j12 + + + net.sf.jopt-simple + jopt-simple + + + org.scala-lang + scala-library + + + org.scala-lang + scala-reflect + + + org.apache.zookeeper + zookeeper + + + test + - - -org.apache.kafka -kafka-clients -${kafka.version} - - -org.slf4j -slf4j-log4j12 - - -net.sf.jopt-simple -jopt-simple - - -org.scala-lang -scala-library - - -org.apache.zookeeper -zookeeper - - - - + + org.scala-lang + scala-library + 2.12.8 + test + + \ No newline at end of file diff --git a/pinot-connectors/pinot-connector-kafka-2.0/src/main/java/org/apache/pinot/core/realtime/i
[GitHub] [incubator-pinot] codecov-io commented on issue #4321: #4317 Feature/variable length bytes offline dictionary for indexing bytes and string dicts.
codecov-io commented on issue #4321: #4317 Feature/variable length bytes offline dictionary for indexing bytes and string dicts. URL: https://github.com/apache/incubator-pinot/pull/4321#issuecomment-507968394 # [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/4321?src=pr&el=h1) Report > Merging [#4321](https://codecov.io/gh/apache/incubator-pinot/pull/4321?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-pinot/commit/2e543e992f842918e2405c276b041d03b0958bf6?src=pr&el=desc) will **increase** coverage by `8.99%`. > The diff coverage is `69.92%`. [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-pinot/pull/4321/graphs/tree.svg?width=650&token=4ibza2ugkz&height=150&src=pr)](https://codecov.io/gh/apache/incubator-pinot/pull/4321?src=pr&el=tree) ```diff @@ Coverage Diff @@ ## master#4321 +/- ## + Coverage 56.23% 65.22% +8.99% Complexity 20 20 Files 1061 1067 +6 Lines 5498055507 +527 Branches 7824 7906 +82 + Hits 3091636203+5287 + Misses2167716737-4940 - Partials 2387 2567 +180 ``` | [Impacted Files](https://codecov.io/gh/apache/incubator-pinot/pull/4321?src=pr&el=tree) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...ot/core/segment/store/SegmentLocalFSDirectory.java](https://codecov.io/gh/apache/incubator-pinot/pull/4321/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zZWdtZW50L3N0b3JlL1NlZ21lbnRMb2NhbEZTRGlyZWN0b3J5LmphdmE=) | `65.73% <0%> (+1.68%)` | `0 <0> (ø)` | :arrow_down: | | [.../core/segment/creator/ColumnIndexCreationInfo.java](https://codecov.io/gh/apache/incubator-pinot/pull/4321/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zZWdtZW50L2NyZWF0b3IvQ29sdW1uSW5kZXhDcmVhdGlvbkluZm8uamF2YQ==) | `93.54% <100%> (+7.34%)` | `0 <0> (ø)` | :arrow_down: | | [...manager/realtime/HLRealtimeSegmentDataManager.java](https://codecov.io/gh/apache/incubator-pinot/pull/4321/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9kYXRhL21hbmFnZXIvcmVhbHRpbWUvSExSZWFsdGltZVNlZ21lbnREYXRhTWFuYWdlci5qYXZh) | `81.21% <100%> (+81.21%)` | `0 <0> (ø)` | :arrow_down: | | [...ment/creator/impl/SegmentColumnarIndexCreator.java](https://codecov.io/gh/apache/incubator-pinot/pull/4321/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zZWdtZW50L2NyZWF0b3IvaW1wbC9TZWdtZW50Q29sdW1uYXJJbmRleENyZWF0b3IuamF2YQ==) | `87.5% <100%> (+0.8%)` | `0 <0> (ø)` | :arrow_down: | | [...t/creator/impl/SegmentIndexCreationDriverImpl.java](https://codecov.io/gh/apache/incubator-pinot/pull/4321/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zZWdtZW50L2NyZWF0b3IvaW1wbC9TZWdtZW50SW5kZXhDcmVhdGlvbkRyaXZlckltcGwuamF2YQ==) | `89.71% <100%> (+1.74%)` | `0 <0> (ø)` | :arrow_down: | | [...loader/defaultcolumn/BaseDefaultColumnHandler.java](https://codecov.io/gh/apache/incubator-pinot/pull/4321/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zZWdtZW50L2luZGV4L2xvYWRlci9kZWZhdWx0Y29sdW1uL0Jhc2VEZWZhdWx0Q29sdW1uSGFuZGxlci5qYXZh) | `91.89% <100%> (+6.75%)` | `0 <0> (ø)` | :arrow_down: | | [...manager/realtime/LLRealtimeSegmentDataManager.java](https://codecov.io/gh/apache/incubator-pinot/pull/4321/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9kYXRhL21hbmFnZXIvcmVhbHRpbWUvTExSZWFsdGltZVNlZ21lbnREYXRhTWFuYWdlci5qYXZh) | `71.65% <100%> (+28.13%)` | `0 <0> (ø)` | :arrow_down: | | [...gment/index/readers/ImmutableDictionaryReader.java](https://codecov.io/gh/apache/incubator-pinot/pull/4321/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zZWdtZW50L2luZGV4L3JlYWRlcnMvSW1tdXRhYmxlRGljdGlvbmFyeVJlYWRlci5qYXZh) | `90.47% <33.33%> (-3.79%)` | `0 <0> (ø)` | | | [...org/apache/pinot/common/config/IndexingConfig.java](https://codecov.io/gh/apache/incubator-pinot/pull/4321/diff?src=pr&el=tree#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vY29uZmlnL0luZGV4aW5nQ29uZmlnLmphdmE=) | `53.57% <50%> (+0.3%)` | `0 <0> (ø)` | :arrow_down: | | [...indexsegment/generator/SegmentGeneratorConfig.java](https://codecov.io/gh/apache/incubator-pinot/pull/4321/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9pbmRleHNlZ21lbnQvZ2VuZXJhdG9yL1NlZ21lbnRHZW5lcmF0b3JDb25maWcuamF2YQ==) | `61.96% <57.14%> (+9.95%)` | `0 <0> (ø)` | :arrow_down: | | ... and [386 more](https://codecov.io/gh/apa