[ 
https://issues.apache.org/jira/browse/BLUR-107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13666958#comment-13666958
 ] 

Gagan Deep Juneja commented on BLUR-107:
----------------------------------------

I tried the implementation below for the validation:

public void checkOutputSpecs(JobContext context) throws IOException, InterruptedException {
  Configuration config = context.getConfiguration();
  TableDescriptor tableDescriptor = getTableDescriptor(config);
  if (tableDescriptor == null) {
    throw new IOException("setTableDescriptor needs to be called first.");
  }
  int shardCountByTableDescriptor = tableDescriptor.getShardCount();
  Path outputPath = getOutputPath(config);
  FileSystem fileSystem = outputPath.getFileSystem(config);
  // Count the entries under the output path whose names start with the shard prefix.
  int shardCountOnFileSystem = 0;
  for (FileStatus fileStatus : fileSystem.listStatus(outputPath)) {
    if (fileStatus.getPath().getName().startsWith(BlurConstants.SHARD_PREFIX)) {
      shardCountOnFileSystem++;
    }
  }
  if (shardCountOnFileSystem != shardCountByTableDescriptor) {
    throw new IOException("Actual shard count on file system [ " + shardCountOnFileSystem
        + " ] differs from the value set in Table Descriptor [ " + shardCountByTableDescriptor + " ]");
  }
  int reducers = context.getNumReduceTasks();
  if (reducers < 0) {
    // Not implemented yet: depends on what counts as a valid number of reducers.
  }
}

What counts as a valid integer for the number of reducers? Is there a formula 
for it?
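
To make the question concrete, here is one candidate rule sketched as plain 
Java. It is only an assumption for discussion, not something the issue states: 
the reducer count is treated as valid when it is positive and an exact multiple 
of the shard count, so that every shard gets the same number of reducers. The 
class and method names are hypothetical and are not part of Blur or Hadoop. 
Whether a map-only job (zero reducers) should also pass is left open.

```java
// Hypothetical helper for discussion; not part of Blur or Hadoop.
final class ReducerCountCheck {

  private ReducerCountCheck() {
    // Static helper only, no instances.
  }

  // Assumed rule: valid when reducers is positive and an exact multiple of shardCount.
  static boolean isValidReducerCount(int reducers, int shardCount) {
    if (shardCount <= 0) {
      throw new IllegalArgumentException("shardCount must be positive, got " + shardCount);
    }
    return reducers > 0 && reducers % shardCount == 0;
  }
}
```

With a rule like this, the empty branch in checkOutputSpecs could throw an 
IOException when isValidReducerCount(reducers, shardCountByTableDescriptor) 
returns false.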

Regards,
Gagan
                
> Create checkOutputSpecs check in the BlurOutputFormat
> -----------------------------------------------------
>
>                 Key: BLUR-107
>                 URL: https://issues.apache.org/jira/browse/BLUR-107
>             Project: Apache Blur
>          Issue Type: Bug
>    Affects Versions: 0.1.5
>            Reporter: Aaron McCurry
>             Fix For: 0.1.5
>
>
> Using the checkOutputSpecs method on the BlurOutputFormat, validate that the 
> number of reducers is a valid number.  Also check that the Blur table exists 
> and that the number of shards in the FS matches the table descriptor.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
