[GitHub] [gora] LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
URL: https://github.com/apache/gora/pull/175#discussion_r392604325

## File path: gora-tutorial/src/main/java/org/apache/gora/tutorial/log/LogAnalyticsJet.java ##
@@ -0,0 +1,91 @@
+package org.apache.gora.tutorial.log;

Review comment: Done

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services
[GitHub] [gora] LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
URL: https://github.com/apache/gora/pull/175#discussion_r322020832

## File path: gora-jet/pom.xml ##
@@ -0,0 +1,131 @@
+
+<project xmlns="http://maven.apache.org/POM/4.0.0"
+         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+  <modelVersion>4.0.0</modelVersion>
+
+  <parent>
+    <groupId>org.apache.gora</groupId>
+    <artifactId>gora</artifactId>
+    <version>0.9-SNAPSHOT</version>

Review comment: Done
[GitHub] [gora] LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
URL: https://github.com/apache/gora/pull/175#discussion_r317183898

## File path: pom.xml ##
@@ -792,6 +792,7 @@
 <module>gora-ignite</module>
 <module>gora-tutorial</module>
 <module>sources-dist</module>
+<module>gora-jet</module>

Review comment: Done
[GitHub] [gora] LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
URL: https://github.com/apache/gora/pull/175#discussion_r316810640

## File path: pom.xml ##
@@ -840,7 +841,8 @@
 1.0.0
-3.6.4
+3.12.2

Review comment: According to https://mvnrepository.com/artifact/com.hazelcast/hazelcast, 3.12.2 is the latest Hazelcast version, isn't it?
[GitHub] [gora] LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
URL: https://github.com/apache/gora/pull/175#discussion_r312711875

## File path: gora-jet/src/main/java/org/apache/gora/jet/JetSource.java ##
@@ -0,0 +1,108 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.gora.jet;
+
+import com.hazelcast.jet.Traverser;
+import com.hazelcast.jet.core.AbstractProcessor;
+import com.hazelcast.jet.core.ProcessorMetaSupplier;
+import com.hazelcast.jet.core.ProcessorSupplier;
+import com.hazelcast.nio.Address;
+import org.apache.gora.persistency.impl.PersistentBase;
+import org.apache.gora.query.PartitionQuery;
+import org.apache.gora.query.Result;
+
+import javax.annotation.Nonnull;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.function.Function;
+
+import static com.hazelcast.jet.Traversers.traverseIterable;
+import static java.util.stream.Collectors.toList;
+import static java.util.stream.IntStream.range;
+
+/**
+ * jet-source implementation.
+ */
+public class JetSource implements ProcessorMetaSupplier {
+
+  private int totalParallelism;
+  private transient int localParallelism;
+
+  @Override
+  public void init(@Nonnull Context context) {
+    totalParallelism = context.totalParallelism();
+    localParallelism = context.localParallelism();
+  }
+
+  @Nonnull
+  @Override
+  public Function get(@Nonnull List addresses) {
+    Map map = new HashMap<>();
+    for (int i = 0; i < addresses.size(); i++) {
+      // We'll calculate the global index of each processor in the cluster:
+      //globalIndexBase is the first processor index in a certain Jet-Cluster member
+      int globalIndexBase = localParallelism * i;
+
+      // processorCount will be equal to localParallelism:
+      ProcessorSupplier supplier = processorCount ->
+          range(globalIndexBase, globalIndexBase + processorCount)
+              .mapToObj(globalIndex ->
+                  new GoraJetProcessor(getPartionedData(globalIndex))
+              ).collect(toList());
+      map.put(addresses.get(i), supplier);
+    }
+    return map::get;
+  }
+
+  List> getPartionedData(int globalIndex) {

Review comment: Fixed
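The index arithmetic in the quoted `get` method can be tried out without a Jet cluster. The following is only a minimal sketch under stated assumptions: the class name `GlobalIndexSketch`, the method `assign`, and the string member names are hypothetical stand-ins (the real code maps Hazelcast `Address` objects to `ProcessorSupplier` instances), and it assumes every member runs exactly `localParallelism` processors, as the comment in the diff states.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class GlobalIndexSketch {

  // Hypothetical helper: for each member (identified by its position in the
  // list), compute the global processor indexes it would own.
  static Map<String, List<Integer>> assign(List<String> members, int localParallelism) {
    Map<String, List<Integer>> byMember = new LinkedHashMap<>();
    for (int i = 0; i < members.size(); i++) {
      // First global processor index on member i, as in the quoted diff.
      int globalIndexBase = localParallelism * i;
      List<Integer> indexes = IntStream
          .range(globalIndexBase, globalIndexBase + localParallelism)
          .boxed()
          .collect(Collectors.toList());
      byMember.put(members.get(i), indexes);
    }
    return byMember;
  }

  public static void main(String[] args) {
    // Three members with two processors each.
    System.out.println(assign(Arrays.asList("member-a", "member-b", "member-c"), 2));
    // prints {member-a=[0, 1], member-b=[2, 3], member-c=[4, 5]}
  }
}
```

Each global index appears exactly once across the cluster, which is what would let a per-index lookup such as `getPartionedData(globalIndex)` hand every processor a distinct slice of the input.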
[GitHub] [gora] LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
URL: https://github.com/apache/gora/pull/175#discussion_r312711871

## File path: gora-jet/src/main/java/org/apache/gora/jet/JetSource.java ##
@@ -0,0 +1,108 @@
+  List> getPartionedData(int globalIndex) {

Review comment: Fixed
[GitHub] [gora] LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
URL: https://github.com/apache/gora/pull/175#discussion_r312711881

## File path: gora-jet/src/main/java/org/apache/gora/jet/JetSource.java ##
@@ -0,0 +1,108 @@
+  List> getPartionedData(int globalIndex) {

Review comment: Fixed
[GitHub] [gora] LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
URL: https://github.com/apache/gora/pull/175#discussion_r312711890

## File path: gora-jet/src/test/java/org/apache/gora/jet/JetTest.java ##
@@ -0,0 +1,149 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.gora.jet;
+
+import com.hazelcast.core.IMap;
+import com.hazelcast.jet.Jet;
+import com.hazelcast.jet.JetInstance;
+import com.hazelcast.jet.pipeline.BatchSource;
+import com.hazelcast.jet.pipeline.Pipeline;
+import com.hazelcast.jet.pipeline.Sinks;
+import org.apache.gora.jet.generated.Pageview;
+import org.apache.gora.jet.generated.ResultPageView;
+import org.apache.gora.query.Query;
+import org.apache.gora.query.Result;
+import org.apache.gora.store.DataStore;
+import org.apache.gora.store.DataStoreFactory;
+import org.apache.gora.util.GoraException;
+import org.apache.hadoop.hbase.HBaseTestingUtility;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+import java.util.regex.Pattern;
+
+import static com.hazelcast.jet.Traversers.traverseArray;
+import static com.hazelcast.jet.aggregate.AggregateOperations.counting;
+import static com.hazelcast.jet.function.Functions.wholeItem;
+import static org.junit.Assert.assertEquals;
+
+/**
+ * Test case for jet sink and source connectors.
+ */
+public class JetTest {
+
+  private static DataStore dataStore;

Review comment: Fixed
[GitHub] [gora] LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
URL: https://github.com/apache/gora/pull/175#discussion_r312711885

## File path: gora-jet/src/test/java/org/apache/gora/jet/JetTest.java ##
@@ -0,0 +1,149 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.gora.jet;
+
+import com.hazelcast.core.IMap;
+import com.hazelcast.jet.Jet;
+import com.hazelcast.jet.JetInstance;
+import com.hazelcast.jet.pipeline.BatchSource;
+import com.hazelcast.jet.pipeline.Pipeline;
+import com.hazelcast.jet.pipeline.Sinks;
+import org.apache.gora.jet.generated.Pageview;
+import org.apache.gora.jet.generated.ResultPageView;
+import org.apache.gora.query.Query;
+import org.apache.gora.query.Result;
+import org.apache.gora.store.DataStore;
+import org.apache.gora.store.DataStoreFactory;
+import org.apache.gora.util.GoraException;
+import org.apache.hadoop.hbase.HBaseTestingUtility;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+import java.util.regex.Pattern;
+
+import static com.hazelcast.jet.Traversers.traverseArray;
+import static com.hazelcast.jet.aggregate.AggregateOperations.counting;
+import static com.hazelcast.jet.function.Functions.wholeItem;
+import static org.junit.Assert.assertEquals;
+
+/**
+ * Test case for jet sink and source connectors.
+ */
+public class JetTest {
+
+  private static DataStore dataStore;
+  private static DataStore dataStoreOut;
+  static Query query = null;

Review comment: Fixed
[GitHub] [gora] LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
URL: https://github.com/apache/gora/pull/175#discussion_r312711862

## File path: gora-jet/src/main/java/org/apache/gora/jet/JetSink.java ##
@@ -0,0 +1,90 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.gora.jet;
+
+import com.hazelcast.jet.core.AbstractProcessor;
+import com.hazelcast.jet.core.ProcessorMetaSupplier;
+import com.hazelcast.jet.core.ProcessorSupplier;
+import com.hazelcast.nio.Address;
+import org.apache.gora.persistency.impl.PersistentBase;
+import org.apache.gora.util.GoraException;
+
+import javax.annotation.Nonnull;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.function.Function;
+
+import static java.util.stream.Collectors.toList;
+import static java.util.stream.IntStream.range;
+
+/**
+ * jet-sink implementation.
+ */
+public class JetSink implements ProcessorMetaSupplier {
+
+  private transient int localParallelism;
+
+  @Override
+  public void init(@Nonnull Context context) {
+    localParallelism = context.localParallelism();
+  }
+
+  @Nonnull
+  @Override
+  public Function get(@Nonnull List addresses) {
+    Map map = new HashMap<>();
+    for (int i = 0; i < addresses.size(); i++) {
+      //globalIndexBase is the first processor index in a certain Jet-Cluster member
+      int globalIndexBase = localParallelism * i;
+
+      // processorCount will be equal to localParallelism:
+      ProcessorSupplier supplier = processorCount ->
+          range(globalIndexBase, globalIndexBase + processorCount)
+              .mapToObj(globalIndex ->
+                  new SinkProcessor()
+              ).collect(toList());
+      map.put(addresses.get(i), supplier);
+    }
+    return map::get;
+  }
+}
+
+class SinkProcessor extends AbstractProcessor {
+
+  @Override
+  public boolean isCooperative() {
+    return false;
+  }
+
+  @Override
+  protected boolean tryProcess(int ordinal, Object item) throws Exception {

Review comment: Fixed
[GitHub] [gora] LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
URL: https://github.com/apache/gora/pull/175#discussion_r312711846

## File path: gora-jet/src/main/java/org/apache/gora/jet/JetEngine.java ##
@@ -0,0 +1,55 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.gora.jet;
+
+import com.hazelcast.jet.pipeline.BatchSource;
+import com.hazelcast.jet.pipeline.Sink;
+import com.hazelcast.jet.pipeline.Sinks;
+import com.hazelcast.jet.pipeline.Sources;
+import org.apache.gora.persistency.impl.PersistentBase;
+import org.apache.gora.query.Query;
+import org.apache.gora.store.DataStore;
+
+/**
+ * Core class which handles Gora - Jet Engine integration.
+ */
+public class JetEngine {
+  public static DataStore dataOutStore;
+  public static DataStore dataInStore;
+  public static Query query;
+
+  public BatchSource> createDataSource(DataStore dataOutStore) {
+    return createDataSource(dataOutStore, dataOutStore.newQuery());
+  }
+
+  public BatchSource> createDataSource(DataStore dataOutStore,
+      Query query) {
+    JetEngine.dataInStore = dataOutStore;
+    JetEngine.query = query;
+    BatchSource> source = Sources.batchFromProcessor("gora-jet-source",
+        new JetSource());
+    return source;
+  }
+
+  public Sink> createDataSink(DataStore dataOutStore) {
+    JetEngine.dataOutStore = dataOutStore;
+    Sink> sink = Sinks.fromProcessor("gora-jet-sink",

Review comment: Fixed
[GitHub] [gora] LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
URL: https://github.com/apache/gora/pull/175#discussion_r312711854

## File path: gora-jet/src/main/java/org/apache/gora/jet/JetEngine.java ##
@@ -0,0 +1,55 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.gora.jet;
+
+import com.hazelcast.jet.pipeline.BatchSource;
+import com.hazelcast.jet.pipeline.Sink;
+import com.hazelcast.jet.pipeline.Sinks;
+import com.hazelcast.jet.pipeline.Sources;
+import org.apache.gora.persistency.impl.PersistentBase;
+import org.apache.gora.query.Query;
+import org.apache.gora.store.DataStore;
+
+/**
+ * Core class which handles Gora - Jet Engine integration.
+ */
+public class JetEngine {
+  public static DataStore dataOutStore;
+  public static DataStore dataInStore;
+  public static Query query;
+
+  public BatchSource> createDataSource(DataStore dataOutStore) {
+    return createDataSource(dataOutStore, dataOutStore.newQuery());
+  }
+
+  public BatchSource> createDataSource(DataStore dataOutStore,
+      Query query) {
+    JetEngine.dataInStore = dataOutStore;
+    JetEngine.query = query;
+    BatchSource> source = Sources.batchFromProcessor("gora-jet-source",

Review comment: Fixed
[GitHub] [gora] LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
URL: https://github.com/apache/gora/pull/175#discussion_r312711860

## File path: gora-jet/src/main/java/org/apache/gora/jet/JetInputOutputFormat.java ##
@@ -0,0 +1,50 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.gora.jet;
+
+import org.apache.gora.persistency.impl.PersistentBase;
+
+/**
+ * Wrapper class which will be used to fetch data from data stores to Gora-
+ * jet-source and to write data into data stores through Gora-jet-sink.
+ */
+public class JetInputOutputFormat {
+  public KeyOut key;

Review comment: Fixed
[GitHub] [gora] LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
URL: https://github.com/apache/gora/pull/175#discussion_r312711838

## File path: gora-jet/src/main/java/org/apache/gora/jet/JetEngine.java ##
@@ -0,0 +1,55 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.gora.jet;
+
+import com.hazelcast.jet.pipeline.BatchSource;
+import com.hazelcast.jet.pipeline.Sink;
+import com.hazelcast.jet.pipeline.Sinks;
+import com.hazelcast.jet.pipeline.Sources;
+import org.apache.gora.persistency.impl.PersistentBase;
+import org.apache.gora.query.Query;
+import org.apache.gora.store.DataStore;
+
+/**
+ * Core class which handles Gora - Jet Engine integration.
+ */
+public class JetEngine {
+  public static DataStore dataOutStore;

Review comment: Fixed
[GitHub] [gora] LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
URL: https://github.com/apache/gora/pull/175#discussion_r312711837

## File path: gora-jet/pom.xml ##
@@ -0,0 +1,132 @@
+
+<project xmlns="http://maven.apache.org/POM/4.0.0"
+         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+  <modelVersion>4.0.0</modelVersion>
+
+  <parent>
+    <groupId>org.apache.gora</groupId>
+    <artifactId>gora</artifactId>
+    <version>0.9-SNAPSHOT</version>
+    <relativePath>../</relativePath>
+  </parent>
+
+  <artifactId>gora-jet</artifactId>
+  <packaging>bundle</packaging>
+
+  <name>Apache Gora :: Jet</name>
+  <url>http://gora.apache.org</url>
+  <description>Jet -> Gora -> Jet Sink and Source connectors</description>
+  <inceptionYear>2019</inceptionYear>

Review comment: Fixed
[GitHub] [gora] LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
URL: https://github.com/apache/gora/pull/175#discussion_r312711418

## File path: gora-jet/src/main/java/org/apache/gora/jet/JetEngine.java ##
@@ -0,0 +1,55 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.gora.jet;
+
+import com.hazelcast.jet.pipeline.BatchSource;
+import com.hazelcast.jet.pipeline.Sink;
+import com.hazelcast.jet.pipeline.Sinks;
+import com.hazelcast.jet.pipeline.Sources;
+import org.apache.gora.persistency.impl.PersistentBase;
+import org.apache.gora.query.Query;
+import org.apache.gora.store.DataStore;
+
+/**
+ * Core class which handles Gora - Jet Engine integration.
+ */
+public class JetEngine {
+  public static DataStore dataOutStore;
+  public static DataStore dataInStore;
+  public static Query query;
+
+  public BatchSource> createDataSource(DataStore dataOutStore) {
+    return createDataSource(dataOutStore, dataOutStore.newQuery());
+  }
+
+  public BatchSource> createDataSource(DataStore dataOutStore,

Review comment: Actually, these are the methods exposed to the end user to create the Jet source and sink.
[GitHub] [gora] LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
URL: https://github.com/apache/gora/pull/175#discussion_r308899303
## File path: gora-jet/src/test/java/org/apache/gora/jet/JetTest.java ##
@@ -0,0 +1,132 @@
+package org.apache.gora.jet;
+
+import com.hazelcast.core.IMap;
+import com.hazelcast.jet.Jet;
+import com.hazelcast.jet.JetInstance;
+import com.hazelcast.jet.pipeline.BatchSource;
+import com.hazelcast.jet.pipeline.Pipeline;
+import com.hazelcast.jet.pipeline.Sinks;
+import org.apache.gora.jet.generated.Pageview;
+import org.apache.gora.jet.generated.ResultPageView;
+import org.apache.gora.query.Query;
+import org.apache.gora.store.DataStore;
+import org.apache.gora.store.DataStoreFactory;
+import org.apache.gora.util.GoraException;
+import org.apache.hadoop.conf.Configuration;
+import org.junit.Test;
+
+import java.util.regex.Pattern;
+
+import static com.hazelcast.jet.Traversers.traverseArray;
+import static com.hazelcast.jet.aggregate.AggregateOperations.counting;
+import static com.hazelcast.jet.function.Functions.wholeItem;
+
+public class JetTest {
+
+  private static DataStore dataStore;
+  private static DataStore dataStoreOut;
+  static Query query = null;
+
+  @Test
+  public void testNewJetSource() {
+
+    Configuration conf = new Configuration();
+
+    try {
+      dataStore = DataStoreFactory.getDataStore(Long.class, Pageview.class, conf);
+    } catch (GoraException e) {
+      e.printStackTrace();
+    }
+
+    try {
+      dataStoreOut = DataStoreFactory.getDataStore(Long.class, ResultPageView.class, conf);
+    } catch (GoraException e) {
+      e.printStackTrace();
+    }
+
+    query = dataStore.newQuery();
+    query.setStartKey(0L);
+    query.setEndKey(55L);
+
+    JetEngine jetEngine = new JetEngine<>();
+    BatchSource> fileSource = jetEngine.createDataSource(dataStore, query);
+    Pipeline p = Pipeline.create();
+    p.drawFrom(fileSource)
+        .filter(item -> item.getValue().getIp().toString().equals("88.240.129.183"))
+        .map(e -> {
+          ResultPageView resultPageView = new ResultPageView();
+          resultPageView.setIp(e.getValue().getIp());
+          resultPageView.setTimestamp(e.getValue().getTimestamp());
+          resultPageView.setUrl(e.getValue().getUrl());
+          return new JetInputOutputFormat(e.getValue().getTimestamp(), resultPageView);
+        })
+        .drainTo(jetEngine.createDataSink(dataStoreOut));
+
+    JetInstance jet = Jet.newJetInstance();
+    Jet.newJetInstance();
+    try {
+      jet.newJob(p).join();
+    } finally {
+      Jet.shutdownAll();
+    }
+  }
+
+  @Test
+  public void insertData() {
+    try {
+      dataStoreOut = DataStoreFactory.getDataStore(Long.class, ResultPageView.class, new Configuration());
+    } catch (GoraException e) {
+      e.printStackTrace();
+    }
+
+    ResultPageView resultPageView = new ResultPageView();
+    resultPageView.setIp("123");
+    resultPageView.setTimestamp(123L);
+    resultPageView.setUrl("I am the the one");
+
+    ResultPageView resultPageView1 = new ResultPageView();
+    resultPageView1.setIp("123");
+    resultPageView1.setTimestamp(123L);
+    resultPageView1.setUrl("How are you");
+
+    try {
+      dataStoreOut.put(1L,resultPageView);
+      dataStoreOut.put(2L,resultPageView1);
+      dataStoreOut.flush();
+    } catch (GoraException e) {
+      e.printStackTrace();
+    }
+  }
+
+  @Test
+  public void jetWordCount() {
+    try {
+      dataStoreOut = DataStoreFactory.getDataStore(Long.class, ResultPageView.class, new Configuration());
+    } catch (GoraException e) {
+      e.printStackTrace();
+    }
+    Query query = dataStoreOut.newQuery();
+    JetEngine jetEngine = new JetEngine<>();
+
+    Pattern delimiter = Pattern.compile("\\W+");
+    Pipeline p = Pipeline.create();
+    p.drawFrom(jetEngine.createDataSource(dataStoreOut, query))
+        .flatMap(e -> traverseArray(delimiter.split(e.getValue().getUrl().toString().toLowerCase())))
+        .filter(word -> !word.isEmpty())
+        .groupingKey(wholeItem())
+        .aggregate(counting())
+        .drainTo(Sinks.map("COUNTS"));
+    JetInstance jet = Jet.newJetInstance();;
+    try {
+      System.out.print("\nCounting words... ");
+      jet.newJob(p).join();
+      IMap counts = jet.getMap("COUNTS");
+      if (counts.get("the") != 2) {
Review comment: Done
[GitHub] [gora] LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
URL: https://github.com/apache/gora/pull/175#discussion_r308899182
## File path: gora-jet/src/test/java/org/apache/gora/jet/JetTest.java ##
@@ -0,0 +1,132 @@
+package org.apache.gora.jet;
+
+import com.hazelcast.core.IMap;
+import com.hazelcast.jet.Jet;
+import com.hazelcast.jet.JetInstance;
+import com.hazelcast.jet.pipeline.BatchSource;
+import com.hazelcast.jet.pipeline.Pipeline;
+import com.hazelcast.jet.pipeline.Sinks;
+import org.apache.gora.jet.generated.Pageview;
+import org.apache.gora.jet.generated.ResultPageView;
+import org.apache.gora.query.Query;
+import org.apache.gora.store.DataStore;
+import org.apache.gora.store.DataStoreFactory;
+import org.apache.gora.util.GoraException;
+import org.apache.hadoop.conf.Configuration;
+import org.junit.Test;
+
+import java.util.regex.Pattern;
+
+import static com.hazelcast.jet.Traversers.traverseArray;
+import static com.hazelcast.jet.aggregate.AggregateOperations.counting;
+import static com.hazelcast.jet.function.Functions.wholeItem;
+
+public class JetTest {
+
+  private static DataStore dataStore;
+  private static DataStore dataStoreOut;
+  static Query query = null;
+
+  @Test
+  public void testNewJetSource() {
+
+    Configuration conf = new Configuration();
+
+    try {
+      dataStore = DataStoreFactory.getDataStore(Long.class, Pageview.class, conf);
+    } catch (GoraException e) {
+      e.printStackTrace();
+    }
+
+    try {
+      dataStoreOut = DataStoreFactory.getDataStore(Long.class, ResultPageView.class, conf);
+    } catch (GoraException e) {
+      e.printStackTrace();
+    }
+
+    query = dataStore.newQuery();
+    query.setStartKey(0L);
+    query.setEndKey(55L);
+
+    JetEngine jetEngine = new JetEngine<>();
+    BatchSource> fileSource = jetEngine.createDataSource(dataStore, query);
+    Pipeline p = Pipeline.create();
+    p.drawFrom(fileSource)
+        .filter(item -> item.getValue().getIp().toString().equals("88.240.129.183"))
+        .map(e -> {
+          ResultPageView resultPageView = new ResultPageView();
+          resultPageView.setIp(e.getValue().getIp());
+          resultPageView.setTimestamp(e.getValue().getTimestamp());
+          resultPageView.setUrl(e.getValue().getUrl());
+          return new JetInputOutputFormat(e.getValue().getTimestamp(), resultPageView);
+        })
+        .drainTo(jetEngine.createDataSink(dataStoreOut));
+
+    JetInstance jet = Jet.newJetInstance();
+    Jet.newJetInstance();
+    try {
+      jet.newJob(p).join();
+    } finally {
+      Jet.shutdownAll();
+    }
+  }
+
+  @Test
+  public void insertData() {
+    try {
+      dataStoreOut = DataStoreFactory.getDataStore(Long.class, ResultPageView.class, new Configuration());
+    } catch (GoraException e) {
+      e.printStackTrace();
+    }
+
+    ResultPageView resultPageView = new ResultPageView();
+    resultPageView.setIp("123");
+    resultPageView.setTimestamp(123L);
+    resultPageView.setUrl("I am the the one");
+
+    ResultPageView resultPageView1 = new ResultPageView();
+    resultPageView1.setIp("123");
+    resultPageView1.setTimestamp(123L);
+    resultPageView1.setUrl("How are you");
+
+    try {
+      dataStoreOut.put(1L,resultPageView);
+      dataStoreOut.put(2L,resultPageView1);
+      dataStoreOut.flush();
Review comment: Done
[GitHub] [gora] LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
URL: https://github.com/apache/gora/pull/175#discussion_r305626579
## File path: gora-jet/src/main/java/org/apache/gora/jet/JetInputOutputFormat.java ##
@@ -0,0 +1,29 @@
+package org.apache.gora.jet;
+
+import org.apache.gora.persistency.impl.PersistentBase;
+
+public class JetInputOutputFormat {
Review comment: Done
[GitHub] [gora] LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
URL: https://github.com/apache/gora/pull/175#discussion_r305626584
## File path: gora-jet/src/main/java/org/apache/gora/jet/JetSource.java ##
@@ -0,0 +1,89 @@
+package org.apache.gora.jet;
+
+import com.hazelcast.jet.Traverser;
+import com.hazelcast.jet.core.AbstractProcessor;
+import com.hazelcast.jet.core.ProcessorMetaSupplier;
+import com.hazelcast.jet.core.ProcessorSupplier;
+import com.hazelcast.nio.Address;
+import org.apache.gora.persistency.impl.PersistentBase;
+import org.apache.gora.query.PartitionQuery;
+import org.apache.gora.query.Result;
+
+import javax.annotation.Nonnull;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.function.Function;
+
+import static com.hazelcast.jet.Traversers.traverseIterable;
+import static java.util.stream.Collectors.toList;
+import static java.util.stream.IntStream.range;
+
+public class JetSource implements ProcessorMetaSupplier {
+
+  private int totalParallelism;
+  private transient int localParallelism;
+
+  @Override
+  public void init(@Nonnull Context context) {
+    totalParallelism = context.totalParallelism();
+    localParallelism = context.localParallelism();
+  }
+
+  @Nonnull
+  @Override
+  public Function get(@Nonnull List addresses) {
+    Map map = new HashMap<>();
+    for (int i = 0; i < addresses.size(); i++) {
+      // We'll calculate the global index of each processor in the cluster:
+      //globalIndexBase is the first processor index in a certain Jet-Cluster member
+      int globalIndexBase = localParallelism * i;
+
+      // processorCount will be equal to localParallelism:
+      ProcessorSupplier supplier = processorCount ->
+          range(globalIndexBase, globalIndexBase + processorCount)
+              .mapToObj(globalIndex ->
+                  new GoraJetProcessor(getPartionedData(globalIndex))
+              ).collect(toList());
+      map.put(addresses.get(i), supplier);
+    }
+    return map::get;
+  }
+
+  List> getPartionedData(int globalIndex) {
+    try {
+      List> partitionQueries = JetEngine.dataInStore.getPartitions(JetEngine.query);
+      List> resultsList = new ArrayList<>();
+      int i = 1;
+      int partitionNo = globalIndex;
+      while (partitionNo < partitionQueries.size()) {
+        Result result = null;
+        result = partitionQueries.get(partitionNo).execute();
+        while (result.next()) {
+          resultsList.add(new JetInputOutputFormat<>(result.getKey(), result.get()));
+        }
+        partitionNo = (i * totalParallelism) + globalIndex;
+        i++;
+      }
+      return resultsList;
+    } catch (Exception e) {
+      e.printStackTrace();
Review comment: Done
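The loop in getPartionedData in the JetSource diff above appears to give each processor every totalParallelism-th partition, starting at that processor's own globalIndex. The following is a minimal standalone sketch of that striding only, assuming plain integers stand in for Gora's partition queries and that totalParallelism is positive. The class name PartitionStride and the method partitionsFor are hypothetical and are not part of the pull request.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not code from the pull request.
public class PartitionStride {

    // Returns the partition indices visited by the processor with the given
    // global index: start at globalIndex and step by totalParallelism while
    // the index is still below partitionCount. Assumes totalParallelism > 0.
    static List<Integer> partitionsFor(int globalIndex, int totalParallelism, int partitionCount) {
        List<Integer> assigned = new ArrayList<>();
        for (int p = globalIndex; p < partitionCount; p += totalParallelism) {
            assigned.add(p);
        }
        return assigned;
    }

    public static void main(String[] args) {
        int totalParallelism = 3; // assumed number of processors in the cluster
        int partitionCount = 8;   // assumed number of partition queries
        for (int g = 0; g < totalParallelism; g++) {
            System.out.println("processor " + g + " -> "
                + partitionsFor(g, totalParallelism, partitionCount));
        }
    }
}
```

Under these assumptions, processors 0, 1 and 2 would read partitions [0, 3, 6], [1, 4, 7] and [2, 5], so each partition is read exactly once as long as the global indices cover 0 to totalParallelism - 1.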
[GitHub] [gora] LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support
URL: https://github.com/apache/gora/pull/175#discussion_r305626570
## File path: gora-jet/src/main/java/org/apache/gora/jet/JetInputOutputFormat.java ##
@@ -0,0 +1,29 @@
+package org.apache.gora.jet;
+
Review comment: Done