Accumulo-Pull-Requests - Build # 1382 - Still Unstable

2018-07-30 Thread Apache Jenkins Server
The Apache Jenkins build system has built Accumulo-Pull-Requests (build #1382)

Status: Still Unstable

Check console output at 
https://builds.apache.org/job/Accumulo-Pull-Requests/1382/ to view the results.

[GitHub] keith-turner commented on a change in pull request #560: Provide new Crypto interface & impl

2018-07-30 Thread GitBox
keith-turner commented on a change in pull request #560: Provide new Crypto 
interface & impl
URL: https://github.com/apache/accumulo/pull/560#discussion_r206333446
 
 

 ##
 File path: 
core/src/main/java/org/apache/accumulo/core/security/crypto/CryptoEnvironment.java
 ##
 @@ -0,0 +1,82 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership. The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations 
under
+ * the License.
+ */
+package org.apache.accumulo.core.security.crypto;
+
+import java.io.DataInputStream;
+import java.io.DataOutputStream;
+import java.io.IOException;
+import java.util.Map;
+import java.util.Objects;
+
+public class CryptoEnvironment {
 
 Review comment:
   For API could make this an interface with only getter methods.  Should add 
since tags


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] keith-turner commented on a change in pull request #560: Provide new Crypto interface & impl

2018-07-30 Thread GitBox
keith-turner commented on a change in pull request #560: Provide new Crypto 
interface & impl
URL: https://github.com/apache/accumulo/pull/560#discussion_r206335529
 
 

 ##
 File path: 
core/src/main/java/org/apache/accumulo/core/security/crypto/CryptoService.java
 ##
 @@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership. The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations 
under
+ * the License.
+ */
+package org.apache.accumulo.core.security.crypto;
+
+import java.util.Map;
+
+/**
+ * Self contained cryptographic service. All on disk encryption and decryption 
will take place
+ * through this interface. Each implementation must implement a {@link 
FileEncrypter} for encryption
+ * and a {@link FileDecrypter} for decryption.
+ *
+ * @since 2.0
+ */
+public interface CryptoService {
+
+  /**
+   * Initialize CryptoService. This is called once at Tablet Server startup.
+   *
+   * @since 2.0
+   */
+  void init(Map conf) throws CryptoException;
 
 Review comment:
   Could possibly pass an interface like CryptoEnv to init inorder to support 
easier API evolution.  Not sure what to name it.
   
   Also could possibly replace init with a factory like :
   
   ```java
   interface CyrptoServiceFactory {
 CryptoService newCryptoService(Map conf);
   }
   ```
   I am not sure if this better than init, its just something to consider.  


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] keith-turner commented on a change in pull request #560: Provide new Crypto interface & impl

2018-07-30 Thread GitBox
keith-turner commented on a change in pull request #560: Provide new Crypto 
interface & impl
URL: https://github.com/apache/accumulo/pull/560#discussion_r206333687
 
 

 ##
 File path: 
core/src/main/java/org/apache/accumulo/core/security/crypto/CryptoEnvironment.java
 ##
 @@ -0,0 +1,82 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership. The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations 
under
+ * the License.
+ */
+package org.apache.accumulo.core.security.crypto;
 
 Review comment:
   Could move all of the crypto APIs to org.apache.accumulo.core.spi.crypto


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] jmark99 commented on issue #575: Add splits to table at time of table creation #573

2018-07-30 Thread GitBox
jmark99 commented on issue #575: Add splits to table at time of table creation 
#573
URL: https://github.com/apache/accumulo/pull/575#issuecomment-409023433
 
 
   @keith-turner, @milleruntime, Thanks for the initial pass over the PR. It is 
an involved change and also my first run at modifying a FATE operation. I will 
begin addressing and modifying the code based on your comments throughout the 
next several days. I'm sure I will have some questions along the way as well. 
Please let me know of any additional suggestions or concerns as they come to 
you.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] adamjshook commented on a change in pull request #574: Add tool to swap out and quarantine corrupt logs

2018-07-30 Thread GitBox
adamjshook commented on a change in pull request #574: Add tool to swap out and 
quarantine corrupt logs
URL: https://github.com/apache/accumulo/pull/574#discussion_r206323354
 
 

 ##
 File path: 
server/tserver/src/main/java/org/apache/accumulo/tserver/log/CorruptWalReplacer.java
 ##
 @@ -0,0 +1,290 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.accumulo.tserver.log;
+
+import static java.lang.String.format;
+import static java.nio.charset.StandardCharsets.UTF_8;
+
+import java.net.URLEncoder;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.TreeSet;
+
+import org.apache.accumulo.core.cli.Help;
+import org.apache.accumulo.core.client.AccumuloException;
+import org.apache.accumulo.core.client.AccumuloSecurityException;
+import org.apache.accumulo.core.client.Connector;
+import org.apache.accumulo.core.client.RowIterator;
+import org.apache.accumulo.core.client.Scanner;
+import org.apache.accumulo.core.client.TableNotFoundException;
+import org.apache.accumulo.core.client.ZooKeeperInstance;
+import org.apache.accumulo.core.client.security.tokens.PasswordToken;
+import org.apache.accumulo.core.conf.AccumuloConfiguration;
+import org.apache.accumulo.core.data.Key;
+import org.apache.accumulo.core.data.Mutation;
+import org.apache.accumulo.core.data.Value;
+import org.apache.accumulo.core.data.impl.KeyExtent;
+import 
org.apache.accumulo.core.metadata.schema.MetadataSchema.TabletsSection.LogColumnFamily;
+import 
org.apache.accumulo.core.metadata.schema.MetadataSchema.TabletsSection.TabletColumnFamily;
+import org.apache.accumulo.core.security.Authorizations;
+import org.apache.accumulo.server.fs.VolumeManager;
+import org.apache.accumulo.server.fs.VolumeManagerImpl;
+import org.apache.accumulo.server.log.SortedLogState;
+import org.apache.hadoop.fs.FSDataOutputStream;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.io.Text;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.beust.jcommander.Parameter;
+import com.google.common.primitives.Bytes;
+
+public class CorruptWalReplacer {
+
+  private static final Logger log = 
LoggerFactory.getLogger(CorruptWalReplacer.class);
+  private static final byte[] EMPTY_WAL_CONTENT = Bytes
+  .concat("--- Log File Header (v2) ---".getBytes(UTF_8), new byte[] {0, 
0, 0, 0});
+
+  private Connector connector;
+  private String quarantineDir;
+  private String workDir;
+
+  public CorruptWalReplacer(String instanceName, String zooKeepers, String 
user, String password,
+  String quarantineDir, String workDir) throws AccumuloSecurityException, 
AccumuloException {
+ZooKeeperInstance instance = new ZooKeeperInstance(instanceName, 
zooKeepers);
+this.connector = instance.getConnector(user, new PasswordToken(password));
+this.quarantineDir = quarantineDir;
+this.workDir = workDir;
+  }
+
+  public void run() throws Exception {
+VolumeManager fs = VolumeManagerImpl.get();
+Path quarantineDir = new Path(this.quarantineDir);
+Path workDir = new Path(this.workDir);
+
+for (KeyExtent extent : getExtents()) {
+  log.info("Checking logs for extent {}", extent);
+  List logs = getLogs(extent);
+
+  if (logs.isEmpty()) {
+log.info("No logs found for key {}", extent);
+continue;
+  }
+
+  // Clean up the working directory, if it exists
+  ArrayList dirs = new ArrayList<>();
+  if (fs.exists(workDir)) {
+log.info("Deleting {}", workDir);
+fs.deleteRecursively(workDir);
+  }
+
+  try {
+// Map containing the name of the file to the full path
+Map nameToSource = new HashMap<>();
+for (Path path : logs) {
+  Path destPath = new Path(workDir, path.getName());
+
+  // Run the log sorter task to prepare for recovery
+  LogSorter.LogSorterTask task = new LogSorter.LogSorterTask(fs,
+  AccumuloConfiguration.getDefaultConfiguration());
+
+  log.info("Invoking sort from path {} to path {}", path, destPath);
+  task.sort(path.getName(), path, 

[GitHub] adamjshook commented on a change in pull request #574: Add tool to swap out and quarantine corrupt logs

2018-07-30 Thread GitBox
adamjshook commented on a change in pull request #574: Add tool to swap out and 
quarantine corrupt logs
URL: https://github.com/apache/accumulo/pull/574#discussion_r206323272
 
 

 ##
 File path: 
server/tserver/src/main/java/org/apache/accumulo/tserver/log/CorruptWalReplacer.java
 ##
 @@ -0,0 +1,290 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.accumulo.tserver.log;
+
+import static java.lang.String.format;
+import static java.nio.charset.StandardCharsets.UTF_8;
+
+import java.net.URLEncoder;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.TreeSet;
+
+import org.apache.accumulo.core.cli.Help;
+import org.apache.accumulo.core.client.AccumuloException;
+import org.apache.accumulo.core.client.AccumuloSecurityException;
+import org.apache.accumulo.core.client.Connector;
+import org.apache.accumulo.core.client.RowIterator;
+import org.apache.accumulo.core.client.Scanner;
+import org.apache.accumulo.core.client.TableNotFoundException;
+import org.apache.accumulo.core.client.ZooKeeperInstance;
+import org.apache.accumulo.core.client.security.tokens.PasswordToken;
+import org.apache.accumulo.core.conf.AccumuloConfiguration;
+import org.apache.accumulo.core.data.Key;
+import org.apache.accumulo.core.data.Mutation;
+import org.apache.accumulo.core.data.Value;
+import org.apache.accumulo.core.data.impl.KeyExtent;
+import 
org.apache.accumulo.core.metadata.schema.MetadataSchema.TabletsSection.LogColumnFamily;
+import 
org.apache.accumulo.core.metadata.schema.MetadataSchema.TabletsSection.TabletColumnFamily;
+import org.apache.accumulo.core.security.Authorizations;
+import org.apache.accumulo.server.fs.VolumeManager;
+import org.apache.accumulo.server.fs.VolumeManagerImpl;
+import org.apache.accumulo.server.log.SortedLogState;
+import org.apache.hadoop.fs.FSDataOutputStream;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.io.Text;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.beust.jcommander.Parameter;
+import com.google.common.primitives.Bytes;
+
+public class CorruptWalReplacer {
+
+  private static final Logger log = 
LoggerFactory.getLogger(CorruptWalReplacer.class);
+  private static final byte[] EMPTY_WAL_CONTENT = Bytes
+  .concat("--- Log File Header (v2) ---".getBytes(UTF_8), new byte[] {0, 
0, 0, 0});
+
+  private Connector connector;
+  private String quarantineDir;
+  private String workDir;
+
+  public CorruptWalReplacer(String instanceName, String zooKeepers, String 
user, String password,
+  String quarantineDir, String workDir) throws AccumuloSecurityException, 
AccumuloException {
+ZooKeeperInstance instance = new ZooKeeperInstance(instanceName, 
zooKeepers);
+this.connector = instance.getConnector(user, new PasswordToken(password));
+this.quarantineDir = quarantineDir;
+this.workDir = workDir;
+  }
+
+  public void run() throws Exception {
+VolumeManager fs = VolumeManagerImpl.get();
+Path quarantineDir = new Path(this.quarantineDir);
+Path workDir = new Path(this.workDir);
+
+for (KeyExtent extent : getExtents()) {
+  log.info("Checking logs for extent {}", extent);
+  List logs = getLogs(extent);
+
+  if (logs.isEmpty()) {
+log.info("No logs found for key {}", extent);
+continue;
+  }
+
+  // Clean up the working directory, if it exists
+  ArrayList dirs = new ArrayList<>();
+  if (fs.exists(workDir)) {
+log.info("Deleting {}", workDir);
+fs.deleteRecursively(workDir);
+  }
+
+  try {
+// Map containing the name of the file to the full path
+Map nameToSource = new HashMap<>();
+for (Path path : logs) {
+  Path destPath = new Path(workDir, path.getName());
+
+  // Run the log sorter task to prepare for recovery
+  LogSorter.LogSorterTask task = new LogSorter.LogSorterTask(fs,
+  AccumuloConfiguration.getDefaultConfiguration());
+
+  log.info("Invoking sort from path {} to path {}", path, destPath);
+  task.sort(path.getName(), path, 

[GitHub] milleruntime commented on a change in pull request #575: Add splits to table at time of table creation #573

2018-07-30 Thread GitBox
milleruntime commented on a change in pull request #575: Add splits to table at 
time of table creation #573
URL: https://github.com/apache/accumulo/pull/575#discussion_r206304067
 
 

 ##
 File path: 
core/src/main/java/org/apache/accumulo/core/client/admin/NewTableConfiguration.java
 ##
 @@ -134,6 +149,13 @@ public NewTableConfiguration 
setProperties(Map props) {
 if (limitVersion)
   
propertyMap.putAll(IteratorUtil.generateInitialTableProperties(limitVersion));
 
+if (createOffline)
+  propertyMap.put(Property.TABLE_OFFLINE_OPTS + "create.offline", "true");
+
+if (createInitialSplits) {
+  propertyMap.put(Property.TABLE_OFFLINE_OPTS + "create.initial.splits", 
"true");
+}
 
 Review comment:
   Along with what Keith said about the Properties, these map keys should 
probably just be internal constants. since it looks like these actions are just 
performed through the API and not something you would set in the config. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] milleruntime commented on a change in pull request #575: Add splits to table at time of table creation #573

2018-07-30 Thread GitBox
milleruntime commented on a change in pull request #575: Add splits to table at 
time of table creation #573
URL: https://github.com/apache/accumulo/pull/575#discussion_r206307873
 
 

 ##
 File path: 
server/master/src/main/java/org/apache/accumulo/master/tableOps/CreateInitialSplits.java
 ##
 @@ -0,0 +1,179 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.accumulo.master.tableOps;
+
+import java.io.BufferedReader;
+import java.io.IOException;
+import java.io.InputStreamReader;
+import java.util.Map;
+
+import org.apache.accumulo.core.client.AccumuloException;
+import org.apache.accumulo.core.client.AccumuloSecurityException;
+import org.apache.accumulo.core.client.BatchWriter;
+import org.apache.accumulo.core.client.MutationsRejectedException;
+import org.apache.accumulo.core.client.Scanner;
+import org.apache.accumulo.core.client.TableNotFoundException;
+import org.apache.accumulo.core.conf.Property;
+import org.apache.accumulo.core.data.Key;
+import org.apache.accumulo.core.data.Mutation;
+import org.apache.accumulo.core.data.Range;
+import org.apache.accumulo.core.data.Value;
+import org.apache.accumulo.core.data.impl.KeyExtent;
+import org.apache.accumulo.core.security.Authorizations;
+import org.apache.accumulo.fate.Repo;
+import org.apache.accumulo.master.Master;
+import org.apache.accumulo.server.zookeeper.ZooReaderWriter;
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.io.Text;
+import org.apache.zookeeper.KeeperException;
+import org.apache.zookeeper.data.Stat;
+
+public class CreateInitialSplits extends MasterRepo {
+
+  private static final long serialVersionUID = 1L;
+
+  private TableInfo tableInfo;
+
+  CreateInitialSplits(TableInfo ti) {
+this.tableInfo = ti;
+  }
+
+  private class SplitEntry {
+Value dirValue;
+Value lockValue;
+Value timeValue;
+Value firstRow;
+String lastSplit;
+boolean first;
+
+public SplitEntry() {
+  dirValue = null;
+  lockValue = null;
+  timeValue = null;
+  firstRow = null;
+  lastSplit = null;
+  first = true;
+}
+  };
+
+  @Override
+  public long isReady(long tid, Master environment) throws Exception {
+return 0;
+  }
+
+  @Override
+  public Repo call(long tid, Master environment) throws Exception {
+String splitFile = getSplitFileName();
+
+// Read table entry and store values for later use
+SplitEntry splitEntry = new SplitEntry();
+populateSplitEntry(environment, splitEntry);
+
+try (BatchWriter bw = 
environment.getConnector().createBatchWriter("accumulo.metadata")) {
+
+  // Read splits from filesystem and write to metadata table.
+  writeSplitsToMetadataTable(environment, splitFile, splitEntry, bw);
+
+  // last row is handled as a special case
+  writeLastRowToMetadataTable(splitEntry, bw);
+}
+
+return new FinishCreateTable(tableInfo);
+  }
+
+  private void writeLastRowToMetadataTable(SplitEntry splitEntry, BatchWriter 
bw)
+  throws MutationsRejectedException {
+Text lastRow = new Text(tableInfo.tableId.toString() + "<");
+Mutation mut = new Mutation(lastRow);
+mut.put(new Text("srv"), new Text("dir"), splitEntry.dirValue);
+mut.put(new Text("srv"), new Text("lock"), splitEntry.lockValue);
+mut.put(new Text("srv"), new Text("time"), splitEntry.timeValue);
+mut.put(new Text("~tab"), new Text("~pr"),
+KeyExtent.encodePrevEndRow(new Text(splitEntry.lastSplit)));
+bw.addMutation(mut);
+  }
+
+  private void writeSplitsToMetadataTable(Master environment, String splitFile,
+  SplitEntry splitEntry, BatchWriter bw) throws IOException {
+String tableId = tableInfo.tableId.toString();
+try (FSDataInputStream splitStream = 
environment.getInputStream(splitFile)) {
+  try (BufferedReader br = new BufferedReader(new 
InputStreamReader(splitStream))) {
+String split;
+while ((split = br.readLine()) != null) {
+  try {
+Mutation mut = new Mutation(tableId + ";" + split);
+mut.put(new Text("srv"), new Text("dir"), splitEntry.dirValue);
+mut.put(new Text("srv"), new Text("lock"), splitEntry.lockValue);
+

[GitHub] milleruntime commented on a change in pull request #575: Add splits to table at time of table creation #573

2018-07-30 Thread GitBox
milleruntime commented on a change in pull request #575: Add splits to table at 
time of table creation #573
URL: https://github.com/apache/accumulo/pull/575#discussion_r206306925
 
 

 ##
 File path: 
server/master/src/main/java/org/apache/accumulo/master/FateServiceHandler.java
 ##
 @@ -664,4 +670,41 @@ private String validateNamespaceArgument(ByteBuffer 
namespaceArg, TableOperation
   TableOperationExceptionType.INVALID_NAME, why);
 }
   }
+
+  /**
+   * Create a file on the file system to hold the splits to be created at 
table creation.
+   */
+  private String createSplitFile(final long opid, final List 
arguments,
 
 Review comment:
   This is interesting and seems like a first for fate transactions.  I am just 
curious of your thoughts for storing the splits in a file in tmp?  Wouldn't 
this be a problem with recovery if the server is restarted in the middle of the 
fate transaction and the file is gone?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] milleruntime commented on a change in pull request #575: Add splits to table at time of table creation #573

2018-07-30 Thread GitBox
milleruntime commented on a change in pull request #575: Add splits to table at 
time of table creation #573
URL: https://github.com/apache/accumulo/pull/575#discussion_r206305380
 
 

 ##
 File path: 
core/src/test/java/org/apache/accumulo/core/client/admin/NewTableConfigurationTest.java
 ##
 @@ -0,0 +1,106 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.accumulo.core.client.admin;
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertTrue;
+
+import java.util.Collection;
+import java.util.Iterator;
+import java.util.SortedSet;
+import java.util.TreeSet;
+
+import org.apache.accumulo.core.conf.Property;
+import org.apache.hadoop.io.Text;
+import org.junit.Before;
+import org.junit.Test;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class NewTableConfigurationTest {
 
 Review comment:
   Nice test.  There is also NewTableConfigurationIT you could add tests to. 
   
   Since this is new, you could create a follow on ticket to have unit tests 
for the other parameters in NewTableConfiguration tested here as well.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] keith-turner commented on a change in pull request #560: Provide new Crypto interface & impl

2018-07-30 Thread GitBox
keith-turner commented on a change in pull request #560: Provide new Crypto 
interface & impl
URL: https://github.com/apache/accumulo/pull/560#discussion_r206309710
 
 

 ##
 File path: core/src/main/java/org/apache/accumulo/core/conf/Property.java
 ##
 @@ -238,6 +168,20 @@
   PropertyType.STRING,
   "One-line configuration property controlling the network locations "
   + "(hostnames) that are allowed to impersonate other users"),
+  // Crypto-related properties
 
 Review comment:
   Really nice simplification


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] keith-turner commented on a change in pull request #560: Provide new Crypto interface & impl

2018-07-30 Thread GitBox
keith-turner commented on a change in pull request #560: Provide new Crypto 
interface & impl
URL: https://github.com/apache/accumulo/pull/560#discussion_r206307946
 
 

 ##
 File path: 
core/src/main/java/org/apache/accumulo/core/security/crypto/FileDecrypter.java
 ##
 @@ -0,0 +1,28 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership. The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations 
under
+ * the License.
+ */
+package org.apache.accumulo.core.security.crypto;
+
+import java.io.InputStream;
+
+public interface FileDecrypter {
+  /**
+   * Decrypt the InputStream
+   *
+   * @since 2.0
+   */
+  InputStream decryptStream(InputStream inputStream) throws 
CryptoService.CryptoException;
 
 Review comment:
   Could provide lifecycle info in javadocs, like how often this is called for 
RFiles and WAL files.  Not sure if this the best place for that info.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] keith-turner commented on a change in pull request #560: Provide new Crypto interface & impl

2018-07-30 Thread GitBox
keith-turner commented on a change in pull request #560: Provide new Crypto 
interface & impl
URL: https://github.com/apache/accumulo/pull/560#discussion_r206304543
 
 

 ##
 File path: 
core/src/main/java/org/apache/accumulo/core/security/crypto/CryptoService.java
 ##
 @@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership. The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations 
under
+ * the License.
+ */
+package org.apache.accumulo.core.security.crypto;
+
+import java.util.Map;
+
+/**
+ * Self contained cryptographic service. All on disk encryption and decryption 
will take place
+ * through this interface. Each implementation must implement a {@link 
FileEncrypter} for encryption
+ * and a {@link FileDecrypter} for decryption.
+ *
+ * @since 2.0
+ */
+public interface CryptoService {
+
+  /**
+   * Initialize CryptoService. This is called once at Tablet Server startup.
+   *
+   * @since 2.0
+   */
+  void init(Map conf) throws CryptoException;
 
 Review comment:
   I noticed that CryptoEnvironment has a getConf method.  Is the expectation 
that the conf passed here and that in CryptoEnvironment are always the same?  
Does the user need to handle the conf changing between the call to init and 
getFileEncrypter?  Would it make sense to drop this init method since 
CryptoEnvironment has the conf?
   
   If a user is implementing this, under what circumstances should they throw 
CryptoException?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] keith-turner commented on a change in pull request #560: Provide new Crypto interface & impl

2018-07-30 Thread GitBox
keith-turner commented on a change in pull request #560: Provide new Crypto 
interface & impl
URL: https://github.com/apache/accumulo/pull/560#discussion_r206307315
 
 

 ##
 File path: 
core/src/main/java/org/apache/accumulo/core/security/crypto/CryptoService.java
 ##
 @@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership. The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations 
under
+ * the License.
+ */
+package org.apache.accumulo.core.security.crypto;
+
+import java.util.Map;
+
+/**
+ * Self contained cryptographic service. All on disk encryption and decryption 
will take place
+ * through this interface. Each implementation must implement a {@link 
FileEncrypter} for encryption
+ * and a {@link FileDecrypter} for decryption.
+ *
+ * @since 2.0
+ */
+public interface CryptoService {
+
+  /**
+   * Initialize CryptoService. This is called once at Tablet Server startup.
+   *
+   * @since 2.0
+   */
+  void init(Map conf) throws CryptoException;
+
+  /**
+   * Initialize the FileEncrypter for the environment and return
+   *
+   * @since 2.0
+   */
+  FileEncrypter getFileEncrypter(CryptoEnvironment environment);
 
 Review comment:
   IT would be nice to provide some info about lifecycle in the javadocs.  I 
assume this method is called once per file?  Also could mention that impl need 
to be threadsafe.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] keith-turner commented on a change in pull request #560: Provide new Crypto interface & impl

2018-07-30 Thread GitBox
keith-turner commented on a change in pull request #560: Provide new Crypto 
interface & impl
URL: https://github.com/apache/accumulo/pull/560#discussion_r206302874
 
 

 ##
 File path: 
core/src/main/java/org/apache/accumulo/core/security/crypto/impl/AESCryptoService.java
 ##
 @@ -0,0 +1,517 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership. The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations 
under
+ * the License.
+ */
+package org.apache.accumulo.core.security.crypto.impl;
+
+import java.io.ByteArrayInputStream;
+import java.io.ByteArrayOutputStream;
+import java.io.DataInputStream;
+import java.io.DataOutputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.OutputStream;
+import java.security.InvalidAlgorithmParameterException;
+import java.security.InvalidKeyException;
+import java.security.Key;
+import java.security.NoSuchAlgorithmException;
+import java.security.NoSuchProviderException;
+import java.security.SecureRandom;
+import java.util.Arrays;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Objects;
+
+import javax.crypto.Cipher;
+import javax.crypto.CipherInputStream;
+import javax.crypto.CipherOutputStream;
+import javax.crypto.NoSuchPaddingException;
+import javax.crypto.spec.GCMParameterSpec;
+import javax.crypto.spec.IvParameterSpec;
+
+import org.apache.accumulo.core.security.crypto.BlockedInputStream;
+import org.apache.accumulo.core.security.crypto.BlockedOutputStream;
+import org.apache.accumulo.core.security.crypto.CryptoEnvironment;
+import org.apache.accumulo.core.security.crypto.CryptoService;
+import org.apache.accumulo.core.security.crypto.CryptoUtils;
+import org.apache.accumulo.core.security.crypto.DiscardCloseOutputStream;
+import org.apache.accumulo.core.security.crypto.FileDecrypter;
+import org.apache.accumulo.core.security.crypto.FileEncrypter;
+import org.apache.accumulo.core.security.crypto.RFileCipherOutputStream;
+
+/**
+ * Example implementation of AES encryption for Accumulo
+ */
+public class AESCryptoService implements CryptoService {
+
+  private Key encryptingKek = null;
+  private String encryptingKekId = null;
+  private String encryptingKeyManager = null;
+  // Lets just load keks for reading once
+  private static HashMap decryptingKeys = new 
HashMap();
+
+  @Override
+  public void init(Map conf) throws CryptoException {
+String kekId = conf.get("instance.crypto.opts.kekId");
+String keyMgr = conf.get("instance.crypto.opts.keyManager");
+Objects.requireNonNull(kekId, "Config property table.crypto.opts.kekId is 
required.");
+Objects.requireNonNull(keyMgr, "Config property 
table.crypto.opts.keyManager is required.");
+switch (keyMgr) {
+  case KeyManager.URI:
+this.encryptingKeyManager = keyMgr;
+this.encryptingKekId = kekId;
+this.encryptingKek = KeyManager.loadKekFromUri(kekId);
+break;
+  default:
+throw new CryptoException("Unrecognized key manager");
+}
+
+  }
+
+  @Override
+  public FileEncrypter getFileEncrypter(CryptoEnvironment environment) {
+CryptoModule cm;
+switch (environment.getScope()) {
+  case WAL:
+cm = new AESCBCCryptoModule(this.encryptingKek, this.encryptingKekId,
+this.encryptingKeyManager);
+return cm.getEncrypter();
+
+  case RFILE:
+cm = new AESGCMCryptoModule(this.encryptingKek, this.encryptingKekId,
+this.encryptingKeyManager);
+return cm.getEncrypter();
+
+  default:
+throw new CryptoException("Unknown scope: " + environment.getScope());
+}
+  }
+
+  @Override
+  public FileDecrypter getFileDecrypter(CryptoEnvironment environment) {
+CryptoModule cm;
+ParsedCryptoParameters parsed = 
parseCryptoParameters(environment.getParameters());
+Key kek = loadDecryptionKek(parsed);
+Key fek = KeyManager.unwrapKey(parsed.getEncFek(), kek);
+switch (parsed.getCryptoServiceVersion()) {
+  case AESCBCCryptoModule.VERSION:
+cm = new AESCBCCryptoModule(this.encryptingKek, this.encryptingKekId,
+this.encryptingKeyManager);
+return (cm.getDecrypter(fek));
+  case AESGCMCryptoModule.VERSION:
+cm = new AESGCMCryptoModule(this.encryptingKek, this.encryptingKekId,
+

[GitHub] keith-turner commented on a change in pull request #560: Provide new Crypto interface & impl

2018-07-30 Thread GitBox
keith-turner commented on a change in pull request #560: Provide new Crypto 
interface & impl
URL: https://github.com/apache/accumulo/pull/560#discussion_r206308709
 
 

 ##
 File path: 
core/src/main/java/org/apache/accumulo/core/security/crypto/CryptoService.java
 ##
 @@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership. The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations 
under
+ * the License.
+ */
+package org.apache.accumulo.core.security.crypto;
+
+import java.util.Map;
+
+/**
+ * Self contained cryptographic service. All on disk encryption and decryption 
will take place
+ * through this interface. Each implementation must implement a {@link 
FileEncrypter} for encryption
+ * and a {@link FileDecrypter} for decryption.
+ *
+ * @since 2.0
+ */
+public interface CryptoService {
+
+  /**
+   * Initialize CryptoService. This is called once at Tablet Server startup.
+   *
+   * @since 2.0
+   */
+  void init(Map conf) throws CryptoException;
+
+  /**
+   * Initialize the FileEncrypter for the environment and return
+   *
+   * @since 2.0
 
 Review comment:
   Do not need since tag here because class has it.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] keith-turner commented on a change in pull request #560: Provide new Crypto interface & impl

2018-07-30 Thread GitBox
keith-turner commented on a change in pull request #560: Provide new Crypto 
interface & impl
URL: https://github.com/apache/accumulo/pull/560#discussion_r206309162
 
 

 ##
 File path: 
core/src/main/java/org/apache/accumulo/core/security/crypto/CryptoService.java
 ##
 @@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership. The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations 
under
+ * the License.
+ */
+package org.apache.accumulo.core.security.crypto;
+
+import java.util.Map;
+
+/**
+ * Self contained cryptographic service. All on disk encryption and decryption 
will take place
+ * through this interface. Each implementation must implement a {@link 
FileEncrypter} for encryption
+ * and a {@link FileDecrypter} for decryption.
+ *
+ * @since 2.0
+ */
+public interface CryptoService {
+
+  /**
+   * Initialize CryptoService. This is called once at Tablet Server startup.
+   *
+   * @since 2.0
+   */
+  void init(Map conf) throws CryptoException;
+
+  /**
+   * Initialize the FileEncrypter for the environment and return
+   *
+   * @since 2.0
+   */
+  FileEncrypter getFileEncrypter(CryptoEnvironment environment);
+
+  /**
+   * Initialize the FileDecrypter for the environment and return
+   *
+   * @since 2.0
+   */
+  FileDecrypter getFileDecrypter(CryptoEnvironment environment);
 
 Review comment:
   Would it make more sense to drop parameter info from env and change this 
method to :
   
   ```java
   FileDecrypter getFileDecrypter(CryptoEnvironment environment, byte[] 
parameters);
   ``` 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] keith-turner commented on a change in pull request #560: Provide new Crypto interface & impl

2018-07-30 Thread GitBox
keith-turner commented on a change in pull request #560: Provide new Crypto 
interface & impl
URL: https://github.com/apache/accumulo/pull/560#discussion_r206308319
 
 

 ##
 File path: 
core/src/main/java/org/apache/accumulo/core/security/crypto/FileEncrypter.java
 ##
 @@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership. The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations 
under
+ * the License.
+ */
+package org.apache.accumulo.core.security.crypto;
+
+import java.io.OutputStream;
+
+public interface FileEncrypter {
+  /**
+   * Encrypt the OutputStream.
+   *
+   * @since 2.0
 
 Review comment:
   You can put `@since 2.0` at class level javadoc and all method will inherit 
it unless overridden. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] keith-turner commented on a change in pull request #575: Add splits to table at time of table creation #573

2018-07-30 Thread GitBox
keith-turner commented on a change in pull request #575: Add splits to table at 
time of table creation #573
URL: https://github.com/apache/accumulo/pull/575#discussion_r206277794
 
 

 ##
 File path: core/src/main/java/org/apache/accumulo/core/conf/Property.java
 ##
 @@ -864,10 +864,11 @@
   "Prefix for configuring summarizers for a table. Using this prefix"
   + " multiple summarizers can be configured with options for each 
one. Each"
   + " summarizer configured should have a unique id, this id can be 
anything."
-  + " To add a summarizer set "
-  + "`table.summarizer.=.` If the 
summarizer has options"
-  + ", then for each option set" + " `table.summarizer..opt.=`."),
-
+  + " To add a summarizer set table.summarizer.=. If the summarizer has options, then for each option set"
+  + " table.summarizer..opt.=."),
+  TABLE_OFFLINE_OPTS("table.offline.", null, PropertyType.PREFIX,
 
 Review comment:
   Looking at how this is used it doesn't seem like something that should be a 
prefix that is documented for users (this will end up in the user manual).  It 
seems like its just for internal use for creating tables.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] keith-turner commented on a change in pull request #575: Add splits to table at time of table creation #573

2018-07-30 Thread GitBox
keith-turner commented on a change in pull request #575: Add splits to table at 
time of table creation #573
URL: https://github.com/apache/accumulo/pull/575#discussion_r206295436
 
 

 ##
 File path: 
server/master/src/main/java/org/apache/accumulo/master/tableOps/CreateInitialSplits.java
 ##
 @@ -0,0 +1,179 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.accumulo.master.tableOps;
+
+import java.io.BufferedReader;
+import java.io.IOException;
+import java.io.InputStreamReader;
+import java.util.Map;
+
+import org.apache.accumulo.core.client.AccumuloException;
+import org.apache.accumulo.core.client.AccumuloSecurityException;
+import org.apache.accumulo.core.client.BatchWriter;
+import org.apache.accumulo.core.client.MutationsRejectedException;
+import org.apache.accumulo.core.client.Scanner;
+import org.apache.accumulo.core.client.TableNotFoundException;
+import org.apache.accumulo.core.conf.Property;
+import org.apache.accumulo.core.data.Key;
+import org.apache.accumulo.core.data.Mutation;
+import org.apache.accumulo.core.data.Range;
+import org.apache.accumulo.core.data.Value;
+import org.apache.accumulo.core.data.impl.KeyExtent;
+import org.apache.accumulo.core.security.Authorizations;
+import org.apache.accumulo.fate.Repo;
+import org.apache.accumulo.master.Master;
+import org.apache.accumulo.server.zookeeper.ZooReaderWriter;
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.io.Text;
+import org.apache.zookeeper.KeeperException;
+import org.apache.zookeeper.data.Stat;
+
+public class CreateInitialSplits extends MasterRepo {
+
+  private static final long serialVersionUID = 1L;
+
+  private TableInfo tableInfo;
+
+  CreateInitialSplits(TableInfo ti) {
+this.tableInfo = ti;
+  }
+
+  private class SplitEntry {
+Value dirValue;
+Value lockValue;
+Value timeValue;
+Value firstRow;
+String lastSplit;
+boolean first;
+
+public SplitEntry() {
+  dirValue = null;
+  lockValue = null;
+  timeValue = null;
+  firstRow = null;
+  lastSplit = null;
+  first = true;
+}
+  };
+
+  @Override
+  public long isReady(long tid, Master environment) throws Exception {
+return 0;
+  }
+
+  @Override
+  public Repo call(long tid, Master environment) throws Exception {
+String splitFile = getSplitFileName();
+
+// Read table entry and store values for later use
+SplitEntry splitEntry = new SplitEntry();
+populateSplitEntry(environment, splitEntry);
+
+try (BatchWriter bw = 
environment.getConnector().createBatchWriter("accumulo.metadata")) {
+
+  // Read splits from filesystem and write to metadata table.
+  writeSplitsToMetadataTable(environment, splitFile, splitEntry, bw);
+
+  // last row is handled as a special case
+  writeLastRowToMetadataTable(splitEntry, bw);
+}
+
+return new FinishCreateTable(tableInfo);
+  }
+
+  private void writeLastRowToMetadataTable(SplitEntry splitEntry, BatchWriter 
bw)
+  throws MutationsRejectedException {
+Text lastRow = new Text(tableInfo.tableId.toString() + "<");
+Mutation mut = new Mutation(lastRow);
+mut.put(new Text("srv"), new Text("dir"), splitEntry.dirValue);
 
 Review comment:
   Really need to use constants here.   In PopulateMetadata it calls 
`MetadataTableUtil.addTablet()`.  If you dig into the source of `addTablet()` 
you can see the constants its uses.   Some of the constants are quirky, let me 
know if you have any questions.  I would be happy to provide examples if needed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] keith-turner commented on a change in pull request #575: Add splits to table at time of table creation #573

2018-07-30 Thread GitBox
keith-turner commented on a change in pull request #575: Add splits to table at 
time of table creation #573
URL: https://github.com/apache/accumulo/pull/575#discussion_r206287300
 
 

 ##
 File path: 
core/src/main/java/org/apache/accumulo/core/client/admin/NewTableConfiguration.java
 ##
 @@ -199,6 +225,21 @@ public NewTableConfiguration 
setLocalityGroups(Map> groups) {
 return this;
   }
 
+  /**
+   * Create a new table wkth pre-configured splits from the provided imput 
collection.
 
 Review comment:
   Spelling : `wkth` and `imput`


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] keith-turner commented on a change in pull request #575: Add splits to table at time of table creation #573

2018-07-30 Thread GitBox
keith-turner commented on a change in pull request #575: Add splits to table at 
time of table creation #573
URL: https://github.com/apache/accumulo/pull/575#discussion_r206286822
 
 

 ##
 File path: 
core/src/main/java/org/apache/accumulo/core/client/impl/TableOperationsImpl.java
 ##
 @@ -234,8 +234,19 @@ public void create(String tableName, 
NewTableConfiguration ntc)
 checkArgument(tableName != null, "tableName is null");
 checkArgument(ntc != null, "ntc is null");
 
-List args = 
Arrays.asList(ByteBuffer.wrap(tableName.getBytes(UTF_8)),
-ByteBuffer.wrap(ntc.getTimeType().name().getBytes(UTF_8)));
+List args = new ArrayList<>();
+args.add(ByteBuffer.wrap(tableName.getBytes(UTF_8)));
+args.add(ByteBuffer.wrap(ntc.getTimeType().name().getBytes(UTF_8)));
+// Check for possible initial splits to be added at table creation
+// Always send number of initial splits to be created, even if zero. If 
greater than zero,
+// add the splits to the argument List which will be used by the FATE 
operations.
+int numSplits = ntc.getSplits().size();
+args.add(ByteBuffer.wrap(String.valueOf(numSplits).getBytes(UTF_8)));
+if (numSplits > 0) {
+  for (Text t : ntc.getSplits()) {
+args.add(ByteBuffer.wrap(t.toString().getBytes(UTF_8)));
 
 Review comment:
   Splits may be binary data.  Converting Text to  a String and then to a 
ByteBuffer could scramble the binary data (because it encodes and then decodes 
utf-8).  I would recommend doing `args.add(TextUtil. getByteBuffer(t))`.  Using 
TextUtil properly handles the edge case where Text has a byte array thats 
longer than its length.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] keith-turner commented on a change in pull request #575: Add splits to table at time of table creation #573

2018-07-30 Thread GitBox
keith-turner commented on a change in pull request #575: Add splits to table at 
time of table creation #573
URL: https://github.com/apache/accumulo/pull/575#discussion_r206294607
 
 

 ##
 File path: 
server/master/src/main/java/org/apache/accumulo/master/tableOps/CreateInitialSplits.java
 ##
 @@ -0,0 +1,179 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.accumulo.master.tableOps;
+
+import java.io.BufferedReader;
+import java.io.IOException;
+import java.io.InputStreamReader;
+import java.util.Map;
+
+import org.apache.accumulo.core.client.AccumuloException;
+import org.apache.accumulo.core.client.AccumuloSecurityException;
+import org.apache.accumulo.core.client.BatchWriter;
+import org.apache.accumulo.core.client.MutationsRejectedException;
+import org.apache.accumulo.core.client.Scanner;
+import org.apache.accumulo.core.client.TableNotFoundException;
+import org.apache.accumulo.core.conf.Property;
+import org.apache.accumulo.core.data.Key;
+import org.apache.accumulo.core.data.Mutation;
+import org.apache.accumulo.core.data.Range;
+import org.apache.accumulo.core.data.Value;
+import org.apache.accumulo.core.data.impl.KeyExtent;
+import org.apache.accumulo.core.security.Authorizations;
+import org.apache.accumulo.fate.Repo;
+import org.apache.accumulo.master.Master;
+import org.apache.accumulo.server.zookeeper.ZooReaderWriter;
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.io.Text;
+import org.apache.zookeeper.KeeperException;
+import org.apache.zookeeper.data.Stat;
+
+public class CreateInitialSplits extends MasterRepo {
+
+  private static final long serialVersionUID = 1L;
+
+  private TableInfo tableInfo;
+
+  CreateInitialSplits(TableInfo ti) {
+this.tableInfo = ti;
+  }
+
+  private class SplitEntry {
+Value dirValue;
+Value lockValue;
+Value timeValue;
+Value firstRow;
+String lastSplit;
+boolean first;
+
+public SplitEntry() {
+  dirValue = null;
+  lockValue = null;
+  timeValue = null;
+  firstRow = null;
+  lastSplit = null;
+  first = true;
+}
+  };
+
+  @Override
+  public long isReady(long tid, Master environment) throws Exception {
+return 0;
+  }
+
+  @Override
+  public Repo call(long tid, Master environment) throws Exception {
+String splitFile = getSplitFileName();
+
+// Read table entry and store values for later use
+SplitEntry splitEntry = new SplitEntry();
+populateSplitEntry(environment, splitEntry);
 
 Review comment:
   I would recommend dropping this class and rolling the functionality it 
contains into `PopulateMetadata`.  It seems like the function 
`populateSplitEntry` is reading what `PopulateMetadata` wrote.   
`PopulateMetadata` used to write a single tablet to the metadata table.   It 
could be changed to write one or more tablets.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] keith-turner commented on a change in pull request #575: Add splits to table at time of table creation #573

2018-07-30 Thread GitBox
keith-turner commented on a change in pull request #575: Add splits to table at 
time of table creation #573
URL: https://github.com/apache/accumulo/pull/575#discussion_r206291517
 
 

 ##
 File path: 
server/master/src/main/java/org/apache/accumulo/master/tableOps/CreateTable.java
 ##
 @@ -58,6 +68,10 @@ public long isReady(long tid, Master environment) throws 
Exception {
 Utils.idLock.lock();
 try {
   tableInfo.tableId = Utils.getNextId(tableInfo.tableName, 
master.getInstance(), Table.ID::of);
+  if (tableInfo.props.containsKey(Property.TABLE_OFFLINE_OPTS + 
"create.initial.splits")
+  && this.splitFile != null) {
+storeSplitFileNameInZooKeeper();
 
 Review comment:
   I don't think this needs to be stored in ZK explicitly.  The way FATE works, 
each step of a FATE op is serialized to ZK.  Therefore if you store any info 
you want to keep around in the TableInfo object, then the TableInfo object will 
be serialized in subsequent steps.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] keith-turner commented on a change in pull request #575: Add splits to table at time of table creation #573

2018-07-30 Thread GitBox
keith-turner commented on a change in pull request #575: Add splits to table at 
time of table creation #573
URL: https://github.com/apache/accumulo/pull/575#discussion_r206277794
 
 

 ##
 File path: core/src/main/java/org/apache/accumulo/core/conf/Property.java
 ##
 @@ -864,10 +864,11 @@
   "Prefix for configuring summarizers for a table. Using this prefix"
   + " multiple summarizers can be configured with options for each 
one. Each"
   + " summarizer configured should have a unique id, this id can be 
anything."
-  + " To add a summarizer set "
-  + "`table.summarizer.=.` If the 
summarizer has options"
-  + ", then for each option set" + " `table.summarizer..opt.=`."),
-
+  + " To add a summarizer set table.summarizer.=. If the summarizer has options, then for each option set"
+  + " table.summarizer..opt.=."),
+  TABLE_OFFLINE_OPTS("table.offline.", null, PropertyType.PREFIX,
 
 Review comment:
   Looking at how this is used it doesn't seem like something that should be a 
prefix that is documented for users (this will end up in the user manual).  It 
seems like its just for internal use for creating properties.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] keith-turner commented on a change in pull request #575: Add splits to table at time of table creation #573

2018-07-30 Thread GitBox
keith-turner commented on a change in pull request #575: Add splits to table at 
time of table creation #573
URL: https://github.com/apache/accumulo/pull/575#discussion_r206288308
 
 

 ##
 File path: 
core/src/main/java/org/apache/accumulo/core/client/admin/NewTableConfiguration.java
 ##
 @@ -199,6 +225,21 @@ public NewTableConfiguration 
setLocalityGroups(Map> groups) {
 return this;
   }
 
+  /**
+   * Create a new table wkth pre-configured splits from the provided imput 
collection.
+   *
+   * @param splits
+   *  A SortedSet of String values to be used as split points in a 
newly created table.
+   * @return this
+   */
+  public NewTableConfiguration withSplits(final SortedSet splits) {
+checkArgument(splits != null, "splits set is null");
+checkArgument(splits.isEmpty() != true, "splits set is empty");
+this.createInitialSplits = true;
+this.splitProps = splits;
 
 Review comment:
   Personally I would do `splitProps = ImmuatbleSortedSet.copyOf(splits)` here. 
 This makes this API's behavior more predictable.  


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] keith-turner commented on a change in pull request #575: Add splits to table at time of table creation #573

2018-07-30 Thread GitBox
keith-turner commented on a change in pull request #575: Add splits to table at 
time of table creation #573
URL: https://github.com/apache/accumulo/pull/575#discussion_r206290289
 
 

 ##
 File path: 
server/master/src/main/java/org/apache/accumulo/master/FateServiceHandler.java
 ##
 @@ -664,4 +670,41 @@ private String validateNamespaceArgument(ByteBuffer 
namespaceArg, TableOperation
   TableOperationExceptionType.INVALID_NAME, why);
 }
   }
+
+  /**
+   * Create a file on the file system to hold the splits to be created at 
table creation.
+   */
+  private String createSplitFile(final long opid, final List 
arguments,
+  final int splitCount, final int splitOffset) {
+String opidStr = String.format("%016x", opid);
+String splitPath = getSplitPath("/tmp/splits-" + opidStr);
+try (FSDataOutputStream stream = master.getOutputStream(splitPath)) {
+  writeSplitsToFileSystem(stream, arguments, splitCount, splitOffset);
+} catch (IOException e) {
+  splitPath = null;
+  e.printStackTrace();
 
 Review comment:
   Seems like this should log a message (with filename, error, and opid) and 
maybe rethrow the exception which should cause the create table op to fail.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] keith-turner commented on a change in pull request #575: Add splits to table at time of table creation #573

2018-07-30 Thread GitBox
keith-turner commented on a change in pull request #575: Add splits to table at 
time of table creation #573
URL: https://github.com/apache/accumulo/pull/575#discussion_r206287413
 
 

 ##
 File path: 
core/src/main/java/org/apache/accumulo/core/client/admin/NewTableConfiguration.java
 ##
 @@ -199,6 +225,21 @@ public NewTableConfiguration 
setLocalityGroups(Map> groups) {
 return this;
   }
 
+  /**
+   * Create a new table wkth pre-configured splits from the provided imput 
collection.
+   *
+   * @param splits
+   *  A SortedSet of String values to be used as split points in a 
newly created table.
+   * @return this
 
 Review comment:
   Should add `@since 2.0.0` javadoc tag.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] keith-turner commented on a change in pull request #575: Add splits to table at time of table creation #573

2018-07-30 Thread GitBox
keith-turner commented on a change in pull request #575: Add splits to table at 
time of table creation #573
URL: https://github.com/apache/accumulo/pull/575#discussion_r206287592
 
 

 ##
 File path: 
core/src/main/java/org/apache/accumulo/core/client/admin/NewTableConfiguration.java
 ##
 @@ -103,6 +108,16 @@ public NewTableConfiguration withoutDefaultIterators() {
 return this;
   }
 
+  /**
+   * Create the new table in an offline state.
+   *
+   * @return this
+   */
 
 Review comment:
   Should add `@since 2.0.0` javadoc tag.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] keith-turner commented on a change in pull request #575: Add splits to table at time of table creation #573

2018-07-30 Thread GitBox
keith-turner commented on a change in pull request #575: Add splits to table at 
time of table creation #573
URL: https://github.com/apache/accumulo/pull/575#discussion_r206277033
 
 

 ##
 File path: core/src/main/java/org/apache/accumulo/core/conf/Property.java
 ##
 @@ -864,10 +864,11 @@
   "Prefix for configuring summarizers for a table. Using this prefix"
   + " multiple summarizers can be configured with options for each 
one. Each"
   + " summarizer configured should have a unique id, this id can be 
anything."
-  + " To add a summarizer set "
-  + "`table.summarizer.=.` If the 
summarizer has options"
-  + ", then for each option set" + " `table.summarizer..opt.=`."),
-
+  + " To add a summarizer set table.summarizer.=. If the summarizer has options, then for each option set"
+  + " table.summarizer..opt.=."),
 
 Review comment:
   Why remove the backtick formatting?  I think I added this recently because I 
noticed the less than and greater than chars were not rendering properly in the 
user manual.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


Accumulo-Pull-Requests - Build # 1381 - Unstable

2018-07-30 Thread Apache Jenkins Server
The Apache Jenkins build system has built Accumulo-Pull-Requests (build #1381)

Status: Unstable

Check console output at 
https://builds.apache.org/job/Accumulo-Pull-Requests/1381/ to view the results.

[GitHub] keith-turner commented on a change in pull request #574: Add tool to swap out and quarantine corrupt logs

2018-07-30 Thread GitBox
keith-turner commented on a change in pull request #574: Add tool to swap out 
and quarantine corrupt logs
URL: https://github.com/apache/accumulo/pull/574#discussion_r206245096
 
 

 ##
 File path: 
server/tserver/src/main/java/org/apache/accumulo/tserver/log/CorruptWalReplacer.java
 ##
 @@ -0,0 +1,290 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.accumulo.tserver.log;
+
+import static java.lang.String.format;
+import static java.nio.charset.StandardCharsets.UTF_8;
+
+import java.net.URLEncoder;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.TreeSet;
+
+import org.apache.accumulo.core.cli.Help;
+import org.apache.accumulo.core.client.AccumuloException;
+import org.apache.accumulo.core.client.AccumuloSecurityException;
+import org.apache.accumulo.core.client.Connector;
+import org.apache.accumulo.core.client.RowIterator;
+import org.apache.accumulo.core.client.Scanner;
+import org.apache.accumulo.core.client.TableNotFoundException;
+import org.apache.accumulo.core.client.ZooKeeperInstance;
+import org.apache.accumulo.core.client.security.tokens.PasswordToken;
+import org.apache.accumulo.core.conf.AccumuloConfiguration;
+import org.apache.accumulo.core.data.Key;
+import org.apache.accumulo.core.data.Mutation;
+import org.apache.accumulo.core.data.Value;
+import org.apache.accumulo.core.data.impl.KeyExtent;
+import 
org.apache.accumulo.core.metadata.schema.MetadataSchema.TabletsSection.LogColumnFamily;
+import 
org.apache.accumulo.core.metadata.schema.MetadataSchema.TabletsSection.TabletColumnFamily;
+import org.apache.accumulo.core.security.Authorizations;
+import org.apache.accumulo.server.fs.VolumeManager;
+import org.apache.accumulo.server.fs.VolumeManagerImpl;
+import org.apache.accumulo.server.log.SortedLogState;
+import org.apache.hadoop.fs.FSDataOutputStream;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.io.Text;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.beust.jcommander.Parameter;
+import com.google.common.primitives.Bytes;
+
+public class CorruptWalReplacer {
+
+  private static final Logger log = 
LoggerFactory.getLogger(CorruptWalReplacer.class);
+  private static final byte[] EMPTY_WAL_CONTENT = Bytes
+  .concat("--- Log File Header (v2) ---".getBytes(UTF_8), new byte[] {0, 
0, 0, 0});
+
+  private Connector connector;
+  private String quarantineDir;
+  private String workDir;
+
+  public CorruptWalReplacer(String instanceName, String zooKeepers, String 
user, String password,
+  String quarantineDir, String workDir) throws AccumuloSecurityException, 
AccumuloException {
+ZooKeeperInstance instance = new ZooKeeperInstance(instanceName, 
zooKeepers);
+this.connector = instance.getConnector(user, new PasswordToken(password));
+this.quarantineDir = quarantineDir;
+this.workDir = workDir;
+  }
+
+  public void run() throws Exception {
+VolumeManager fs = VolumeManagerImpl.get();
+Path quarantineDir = new Path(this.quarantineDir);
+Path workDir = new Path(this.workDir);
+
+for (KeyExtent extent : getExtents()) {
+  log.info("Checking logs for extent {}", extent);
+  List logs = getLogs(extent);
+
+  if (logs.isEmpty()) {
+log.info("No logs found for key {}", extent);
+continue;
+  }
+
+  // Clean up the working directory, if it exists
+  ArrayList dirs = new ArrayList<>();
+  if (fs.exists(workDir)) {
+log.info("Deleting {}", workDir);
+fs.deleteRecursively(workDir);
+  }
+
+  try {
+// Map containing the name of the file to the full path
+Map nameToSource = new HashMap<>();
+for (Path path : logs) {
+  Path destPath = new Path(workDir, path.getName());
+
+  // Run the log sorter task to prepare for recovery
+  LogSorter.LogSorterTask task = new LogSorter.LogSorterTask(fs,
+  AccumuloConfiguration.getDefaultConfiguration());
+
+  log.info("Invoking sort from path {} to path {}", path, destPath);
+  task.sort(path.getName(), 

[GitHub] keith-turner commented on a change in pull request #574: Add tool to swap out and quarantine corrupt logs

2018-07-30 Thread GitBox
keith-turner commented on a change in pull request #574: Add tool to swap out 
and quarantine corrupt logs
URL: https://github.com/apache/accumulo/pull/574#discussion_r206243204
 
 

 ##
 File path: 
server/tserver/src/main/java/org/apache/accumulo/tserver/log/CorruptWalReplacer.java
 ##
 @@ -0,0 +1,290 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.accumulo.tserver.log;
+
+import static java.lang.String.format;
+import static java.nio.charset.StandardCharsets.UTF_8;
+
+import java.net.URLEncoder;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.TreeSet;
+
+import org.apache.accumulo.core.cli.Help;
+import org.apache.accumulo.core.client.AccumuloException;
+import org.apache.accumulo.core.client.AccumuloSecurityException;
+import org.apache.accumulo.core.client.Connector;
+import org.apache.accumulo.core.client.RowIterator;
+import org.apache.accumulo.core.client.Scanner;
+import org.apache.accumulo.core.client.TableNotFoundException;
+import org.apache.accumulo.core.client.ZooKeeperInstance;
+import org.apache.accumulo.core.client.security.tokens.PasswordToken;
+import org.apache.accumulo.core.conf.AccumuloConfiguration;
+import org.apache.accumulo.core.data.Key;
+import org.apache.accumulo.core.data.Mutation;
+import org.apache.accumulo.core.data.Value;
+import org.apache.accumulo.core.data.impl.KeyExtent;
+import 
org.apache.accumulo.core.metadata.schema.MetadataSchema.TabletsSection.LogColumnFamily;
+import 
org.apache.accumulo.core.metadata.schema.MetadataSchema.TabletsSection.TabletColumnFamily;
+import org.apache.accumulo.core.security.Authorizations;
+import org.apache.accumulo.server.fs.VolumeManager;
+import org.apache.accumulo.server.fs.VolumeManagerImpl;
+import org.apache.accumulo.server.log.SortedLogState;
+import org.apache.hadoop.fs.FSDataOutputStream;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.io.Text;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.beust.jcommander.Parameter;
+import com.google.common.primitives.Bytes;
+
+public class CorruptWalReplacer {
+
+  private static final Logger log = 
LoggerFactory.getLogger(CorruptWalReplacer.class);
+  private static final byte[] EMPTY_WAL_CONTENT = Bytes
+  .concat("--- Log File Header (v2) ---".getBytes(UTF_8), new byte[] {0, 
0, 0, 0});
+
+  private Connector connector;
+  private String quarantineDir;
+  private String workDir;
+
+  public CorruptWalReplacer(String instanceName, String zooKeepers, String 
user, String password,
+  String quarantineDir, String workDir) throws AccumuloSecurityException, 
AccumuloException {
+ZooKeeperInstance instance = new ZooKeeperInstance(instanceName, 
zooKeepers);
+this.connector = instance.getConnector(user, new PasswordToken(password));
+this.quarantineDir = quarantineDir;
+this.workDir = workDir;
+  }
+
+  public void run() throws Exception {
+VolumeManager fs = VolumeManagerImpl.get();
+Path quarantineDir = new Path(this.quarantineDir);
+Path workDir = new Path(this.workDir);
+
+for (KeyExtent extent : getExtents()) {
+  log.info("Checking logs for extent {}", extent);
+  List logs = getLogs(extent);
+
+  if (logs.isEmpty()) {
+log.info("No logs found for key {}", extent);
+continue;
+  }
+
+  // Clean up the working directory, if it exists
+  ArrayList dirs = new ArrayList<>();
+  if (fs.exists(workDir)) {
+log.info("Deleting {}", workDir);
+fs.deleteRecursively(workDir);
+  }
+
+  try {
+// Map containing the name of the file to the full path
+Map nameToSource = new HashMap<>();
+for (Path path : logs) {
+  Path destPath = new Path(workDir, path.getName());
+
+  // Run the log sorter task to prepare for recovery
+  LogSorter.LogSorterTask task = new LogSorter.LogSorterTask(fs,
+  AccumuloConfiguration.getDefaultConfiguration());
+
+  log.info("Invoking sort from path {} to path {}", path, destPath);
+  task.sort(path.getName(), 

[GitHub] keith-turner commented on a change in pull request #574: Add tool to swap out and quarantine corrupt logs

2018-07-30 Thread GitBox
keith-turner commented on a change in pull request #574: Add tool to swap out 
and quarantine corrupt logs
URL: https://github.com/apache/accumulo/pull/574#discussion_r206240558
 
 

 ##
 File path: 
server/tserver/src/main/java/org/apache/accumulo/tserver/log/CorruptWalReplacer.java
 ##
 @@ -0,0 +1,290 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.accumulo.tserver.log;
+
+import static java.lang.String.format;
+import static java.nio.charset.StandardCharsets.UTF_8;
+
+import java.net.URLEncoder;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.TreeSet;
+
+import org.apache.accumulo.core.cli.Help;
+import org.apache.accumulo.core.client.AccumuloException;
+import org.apache.accumulo.core.client.AccumuloSecurityException;
+import org.apache.accumulo.core.client.Connector;
+import org.apache.accumulo.core.client.RowIterator;
+import org.apache.accumulo.core.client.Scanner;
+import org.apache.accumulo.core.client.TableNotFoundException;
+import org.apache.accumulo.core.client.ZooKeeperInstance;
+import org.apache.accumulo.core.client.security.tokens.PasswordToken;
+import org.apache.accumulo.core.conf.AccumuloConfiguration;
+import org.apache.accumulo.core.data.Key;
+import org.apache.accumulo.core.data.Mutation;
+import org.apache.accumulo.core.data.Value;
+import org.apache.accumulo.core.data.impl.KeyExtent;
+import 
org.apache.accumulo.core.metadata.schema.MetadataSchema.TabletsSection.LogColumnFamily;
+import 
org.apache.accumulo.core.metadata.schema.MetadataSchema.TabletsSection.TabletColumnFamily;
+import org.apache.accumulo.core.security.Authorizations;
+import org.apache.accumulo.server.fs.VolumeManager;
+import org.apache.accumulo.server.fs.VolumeManagerImpl;
+import org.apache.accumulo.server.log.SortedLogState;
+import org.apache.hadoop.fs.FSDataOutputStream;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.io.Text;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.beust.jcommander.Parameter;
+import com.google.common.primitives.Bytes;
+
+public class CorruptWalReplacer {
+
+  private static final Logger log = 
LoggerFactory.getLogger(CorruptWalReplacer.class);
+  private static final byte[] EMPTY_WAL_CONTENT = Bytes
+  .concat("--- Log File Header (v2) ---".getBytes(UTF_8), new byte[] {0, 
0, 0, 0});
+
+  private Connector connector;
+  private String quarantineDir;
+  private String workDir;
+
+  public CorruptWalReplacer(String instanceName, String zooKeepers, String 
user, String password,
+  String quarantineDir, String workDir) throws AccumuloSecurityException, 
AccumuloException {
+ZooKeeperInstance instance = new ZooKeeperInstance(instanceName, 
zooKeepers);
+this.connector = instance.getConnector(user, new PasswordToken(password));
+this.quarantineDir = quarantineDir;
+this.workDir = workDir;
+  }
+
+  public void run() throws Exception {
+VolumeManager fs = VolumeManagerImpl.get();
+Path quarantineDir = new Path(this.quarantineDir);
+Path workDir = new Path(this.workDir);
+
+for (KeyExtent extent : getExtents()) {
+  log.info("Checking logs for extent {}", extent);
+  List logs = getLogs(extent);
+
+  if (logs.isEmpty()) {
+log.info("No logs found for key {}", extent);
+continue;
+  }
+
+  // Clean up the working directory, if it exists
+  ArrayList dirs = new ArrayList<>();
+  if (fs.exists(workDir)) {
+log.info("Deleting {}", workDir);
+fs.deleteRecursively(workDir);
+  }
+
+  try {
+// Map containing the name of the file to the full path
+Map nameToSource = new HashMap<>();
+for (Path path : logs) {
+  Path destPath = new Path(workDir, path.getName());
+
+  // Run the log sorter task to prepare for recovery
+  LogSorter.LogSorterTask task = new LogSorter.LogSorterTask(fs,
+  AccumuloConfiguration.getDefaultConfiguration());
+
+  log.info("Invoking sort from path {} to path {}", path, destPath);
+  task.sort(path.getName(), 

[GitHub] keith-turner commented on a change in pull request #574: Add tool to swap out and quarantine corrupt logs

2018-07-30 Thread GitBox
keith-turner commented on a change in pull request #574: Add tool to swap out 
and quarantine corrupt logs
URL: https://github.com/apache/accumulo/pull/574#discussion_r206241165
 
 

 ##
 File path: 
server/tserver/src/main/java/org/apache/accumulo/tserver/log/CorruptWalReplacer.java
 ##
 @@ -0,0 +1,290 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.accumulo.tserver.log;
+
+import static java.lang.String.format;
+import static java.nio.charset.StandardCharsets.UTF_8;
+
+import java.net.URLEncoder;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.TreeSet;
+
+import org.apache.accumulo.core.cli.Help;
+import org.apache.accumulo.core.client.AccumuloException;
+import org.apache.accumulo.core.client.AccumuloSecurityException;
+import org.apache.accumulo.core.client.Connector;
+import org.apache.accumulo.core.client.RowIterator;
+import org.apache.accumulo.core.client.Scanner;
+import org.apache.accumulo.core.client.TableNotFoundException;
+import org.apache.accumulo.core.client.ZooKeeperInstance;
+import org.apache.accumulo.core.client.security.tokens.PasswordToken;
+import org.apache.accumulo.core.conf.AccumuloConfiguration;
+import org.apache.accumulo.core.data.Key;
+import org.apache.accumulo.core.data.Mutation;
+import org.apache.accumulo.core.data.Value;
+import org.apache.accumulo.core.data.impl.KeyExtent;
+import 
org.apache.accumulo.core.metadata.schema.MetadataSchema.TabletsSection.LogColumnFamily;
+import 
org.apache.accumulo.core.metadata.schema.MetadataSchema.TabletsSection.TabletColumnFamily;
+import org.apache.accumulo.core.security.Authorizations;
+import org.apache.accumulo.server.fs.VolumeManager;
+import org.apache.accumulo.server.fs.VolumeManagerImpl;
+import org.apache.accumulo.server.log.SortedLogState;
+import org.apache.hadoop.fs.FSDataOutputStream;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.io.Text;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.beust.jcommander.Parameter;
+import com.google.common.primitives.Bytes;
+
+public class CorruptWalReplacer {
+
+  private static final Logger log = 
LoggerFactory.getLogger(CorruptWalReplacer.class);
+  private static final byte[] EMPTY_WAL_CONTENT = Bytes
+  .concat("--- Log File Header (v2) ---".getBytes(UTF_8), new byte[] {0, 
0, 0, 0});
+
+  private Connector connector;
+  private String quarantineDir;
+  private String workDir;
+
+  public CorruptWalReplacer(String instanceName, String zooKeepers, String 
user, String password,
+  String quarantineDir, String workDir) throws AccumuloSecurityException, 
AccumuloException {
+ZooKeeperInstance instance = new ZooKeeperInstance(instanceName, 
zooKeepers);
+this.connector = instance.getConnector(user, new PasswordToken(password));
+this.quarantineDir = quarantineDir;
+this.workDir = workDir;
+  }
+
+  public void run() throws Exception {
+VolumeManager fs = VolumeManagerImpl.get();
+Path quarantineDir = new Path(this.quarantineDir);
+Path workDir = new Path(this.workDir);
+
+for (KeyExtent extent : getExtents()) {
+  log.info("Checking logs for extent {}", extent);
+  List logs = getLogs(extent);
+
+  if (logs.isEmpty()) {
+log.info("No logs found for key {}", extent);
+continue;
+  }
+
+  // Clean up the working directory, if it exists
+  ArrayList dirs = new ArrayList<>();
+  if (fs.exists(workDir)) {
+log.info("Deleting {}", workDir);
+fs.deleteRecursively(workDir);
+  }
+
+  try {
+// Map containing the name of the file to the full path
+Map nameToSource = new HashMap<>();
+for (Path path : logs) {
+  Path destPath = new Path(workDir, path.getName());
+
+  // Run the log sorter task to prepare for recovery
+  LogSorter.LogSorterTask task = new LogSorter.LogSorterTask(fs,
+  AccumuloConfiguration.getDefaultConfiguration());
+
+  log.info("Invoking sort from path {} to path {}", path, destPath);
+  task.sort(path.getName(), 

Accumulo-Pull-Requests - Build # 1379 - Fixed

2018-07-30 Thread Apache Jenkins Server
The Apache Jenkins build system has built Accumulo-Pull-Requests (build #1379)

Status: Fixed

Check console output at 
https://builds.apache.org/job/Accumulo-Pull-Requests/1379/ to view the results.

[GitHub] jmark99 opened a new pull request #575: Add splits to table at time of table creation #573

2018-07-30 Thread GitBox
jmark99 opened a new pull request #575: Add splits to table at time of table 
creation #573
URL: https://github.com/apache/accumulo/pull/575
 
 
   This issue updates Accumulo to allow for tables to create splits at the  
time of table creation. A new step is added to the createtable FATE operations 
that handle this task if it is activated.
   
   The most significant modification is that the splits are written to the 
metadata table after the basic table information is created but before the 
table is brought online. This is handled by the CreateInitialSplits class 
within the FATE operation.
   
   Closes #573 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


Accumulo-Pull-Requests - Build # 1378 - Failure

2018-07-30 Thread Apache Jenkins Server
The Apache Jenkins build system has built Accumulo-Pull-Requests (build #1378)

Status: Failure

Check console output at 
https://builds.apache.org/job/Accumulo-Pull-Requests/1378/ to view the results.

[GitHub] adamjshook opened a new pull request #574: Add tool to swap out and quarantine corrupt logs

2018-07-30 Thread GitBox
adamjshook opened a new pull request #574: Add tool to swap out and quarantine 
corrupt logs
URL: https://github.com/apache/accumulo/pull/574
 
 
   Fixes #554 
   
   Work in progress tooling to solicit feedback regarding this tool and any 
other tooling needed.  The intention here is to provide a tool for end-users to 
run whenever a WAL is unrecoverable, effectively making the tablets within the 
WAL unavailable until it is replaced with an empty version.  The tool here will 
perform the same log-sorting process that is done by the TabletServer in order 
to catch an exception being thrown.  It will then move the unrecoverable WAL to 
a given quarantine directory and write a new one in place.  I think the next 
step would be to replay the quarantined WAL(s) and recovery whatever can be 
recovered.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Comment Edited] (ACCUMULO-4808) Add splits to table at table creation.

2018-07-30 Thread Mark Owens (JIRA)


[ 
https://issues.apache.org/jira/browse/ACCUMULO-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16561980#comment-16561980
 ] 

Mark Owens edited comment on ACCUMULO-4808 at 7/30/18 2:33 PM:
---

Duplicated by Github Issue [%573|https://github.com/apache/accumulo/issues/573]


was (Author: jmark99):
See Github Issue [%573|https://github.com/apache/accumulo/issues/573]

> Add splits to table at table creation.
> --
>
> Key: ACCUMULO-4808
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4808
> Project: Accumulo
>  Issue Type: New Feature
>  Components: master, tserver
>Reporter: Mark Owens
>Assignee: Mark Owens
>Priority: Major
> Fix For: 2.0.0
>
>
> Add capability to add table splits at table creation. Recent changes now 
> allow iterator and locality groups to be created at table creation. Do the 
> same with splits. Comment below from 
> [ACCUMULO-4806|https://issues.apache.org/jira/browse/ACCUMULO-4806] explains 
> the motivation for the request:
> {quote}[~etcoleman] added a comment - 2 hours ago
> It would go al long way if the splits could be added at table creation or 
> when table is offline.  When the other API changes were made by Mark, I 
> wondered if this task could also could be done at that time - but I believe 
> that it was more complicated.
> The delay is that when a table is created and then the splits added and then 
> taken offline there is a period proportional to the number of splits as they 
> are off-loaded from the tserver where they originally got assigned.  (The 
> re-online with splits distributed across the cluster is quite fast)
> If the splits could be added at table creation, or while the table is offline 
> so that the delay for shedding the tablets could be avoided, then the need to 
> perform the actual import offline would not be as necessary.
>  
> {quote}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (ACCUMULO-4808) Add splits to table at table creation.

2018-07-30 Thread Mark Owens (JIRA)


 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Owens resolved ACCUMULO-4808.
--
Resolution: Duplicate

See Github Issue [%573|https://github.com/apache/accumulo/issues/573]

> Add splits to table at table creation.
> --
>
> Key: ACCUMULO-4808
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4808
> Project: Accumulo
>  Issue Type: New Feature
>  Components: master, tserver
>Reporter: Mark Owens
>Assignee: Mark Owens
>Priority: Major
> Fix For: 2.0.0
>
>
> Add capability to add table splits at table creation. Recent changes now 
> allow iterator and locality groups to be created at table creation. Do the 
> same with splits. Comment below from 
> [ACCUMULO-4806|https://issues.apache.org/jira/browse/ACCUMULO-4806] explains 
> the motivation for the request:
> {quote}[~etcoleman] added a comment - 2 hours ago
> It would go al long way if the splits could be added at table creation or 
> when table is offline.  When the other API changes were made by Mark, I 
> wondered if this task could also could be done at that time - but I believe 
> that it was more complicated.
> The delay is that when a table is created and then the splits added and then 
> taken offline there is a period proportional to the number of splits as they 
> are off-loaded from the tserver where they originally got assigned.  (The 
> re-online with splits distributed across the cluster is quite fast)
> If the splits could be added at table creation, or while the table is offline 
> so that the delay for shedding the tablets could be avoided, then the need to 
> perform the actual import offline would not be as necessary.
>  
> {quote}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ACCUMULO-4808) Add splits to table at table creation.

2018-07-30 Thread Mark Owens (JIRA)


[ 
https://issues.apache.org/jira/browse/ACCUMULO-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16561977#comment-16561977
 ] 

Mark Owens commented on ACCUMULO-4808:
--

This issue is not represented by the GitHub issue 
[#573|https://github.com/apache/accumulo/issues/573]

> Add splits to table at table creation.
> --
>
> Key: ACCUMULO-4808
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4808
> Project: Accumulo
>  Issue Type: New Feature
>  Components: master, tserver
>Reporter: Mark Owens
>Assignee: Mark Owens
>Priority: Major
> Fix For: 2.0.0
>
>
> Add capability to add table splits at table creation. Recent changes now 
> allow iterator and locality groups to be created at table creation. Do the 
> same with splits. Comment below from 
> [ACCUMULO-4806|https://issues.apache.org/jira/browse/ACCUMULO-4806] explains 
> the motivation for the request:
> {quote}[~etcoleman] added a comment - 2 hours ago
> It would go al long way if the splits could be added at table creation or 
> when table is offline.  When the other API changes were made by Mark, I 
> wondered if this task could also could be done at that time - but I believe 
> that it was more complicated.
> The delay is that when a table is created and then the splits added and then 
> taken offline there is a period proportional to the number of splits as they 
> are off-loaded from the tserver where they originally got assigned.  (The 
> re-online with splits distributed across the cluster is quite fast)
> If the splits could be added at table creation, or while the table is offline 
> so that the delay for shedding the tablets could be avoided, then the need to 
> perform the actual import offline would not be as necessary.
>  
> {quote}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] jmark99 opened a new issue #573: Add splits to table at table creation.

2018-07-30 Thread GitBox
jmark99 opened a new issue #573: Add splits to table at table creation.
URL: https://github.com/apache/accumulo/issues/573
 
 
   Add capability to add table splits at table creation. Recent changes now 
allow iterator and locality groups to be created at table creation. Do the same 
with splits. See comment from [ACCUMULO-4806 
](https://issues.apache.org/jira/browse/ACCUMULO-4808?filter=-1 )for an 
explanation of the motivation for this request.
   
   Linked to JIRA ticket 
[ACCUMULO-4808](https://issues.apache.org/jira/browse/ACCUMULO-4808?filter=-1)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services