paulzanmei opened a new issue #1724:
URL: https://github.com/apache/iceberg/issues/1724


   I called the row-level delete API to delete a single record. After that, a query that uses a filter condition fails with a NullPointerException.
   
   Step #1: Delete data based on a condition
   
   ```
   HadoopCatalog catalog = new HadoopCatalog(conf, warehousePath);
   
   Table table = catalog.loadTable(TableIdentifier.of("test","customers_v2_5"));
   
   Schema deleteRowSchema = table.schema().select("customer_id");
   
   Record dataDelete = GenericRecord.create(deleteRowSchema);
   
   Long id = 16211L;
   List<Record> dataDeletes = Lists.newArrayList(
        dataDelete.copy("customer_id", id)
   );
   
   Map<String, String> props = table.properties();
   FileFormat fileFormat = getFileFormat(props);
   OutputFileFactory fileFactory = new OutputFileFactory(table.spec(), fileFormat,
        table.locationProvider(), table.io(), table.encryption(), 1, 1);
   
   OutputFile temp = fileFactory.newOutputFile().encryptingOutputFile();
   
   EqualityDeleteWriter<Record> writer = Parquet.writeDeletes(temp)
        .forTable(table)
        .withPartition(null)
        .rowSchema(deleteRowSchema)
        .createWriterFunc(GenericParquetWriter::buildWriter)
        .overwrite()
        .equalityFieldIds(deleteRowSchema.columns().stream().mapToInt(Types.NestedField::fieldId).toArray())
        .buildEqualityWriter();
   
   try (Closeable toClose = writer) {
        writer.writeAll(dataDeletes);
   }
                
   DeleteFile eqDeletes = writer.toContentFile();
                
   table.newRowDelta()
        .addDeletes(eqDeletes)
        .commit();
   ```
   Execution succeeds.
   
   Step #2: A query with a filter condition fails
   
   ```
   HadoopCatalog catalog = new HadoopCatalog(conf, warehousePath);
   Table table = catalog.loadTable(TableIdentifier.of("test","customers_v2_5"));
   Iterable<Record> results = IcebergGenerics.read(table)
        .where(Expressions.equal("customer_id", 16219))
        .build();
   for (Record record : results) {
        System.out.println(record.toString());
   }
   ```
   
   ```
   java.lang.NullPointerException
        at org.apache.iceberg.ManifestReader.requireStatsProjection(ManifestReader.java:274)
        at org.apache.iceberg.ManifestReader.entries(ManifestReader.java:175)
        at org.apache.iceberg.ManifestReader.liveEntries(ManifestReader.java:215)
        at org.apache.iceberg.DeleteFileIndex$Builder.lambda$deleteManifestReaders$11(DeleteFileIndex.java:452)
        at org.apache.iceberg.relocated.com.google.common.collect.Iterators$6.transform(Iterators.java:783)
        at org.apache.iceberg.relocated.com.google.common.collect.TransformedIterator.next(TransformedIterator.java:47)
        at org.apache.iceberg.util.Tasks$Builder.runParallel(Tasks.java:301)
        at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:195)
        at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:189)
        at org.apache.iceberg.DeleteFileIndex$Builder.build(DeleteFileIndex.java:357)
        at org.apache.iceberg.ManifestGroup.planFiles(ManifestGroup.java:165)
        at org.apache.iceberg.DataTableScan.planFiles(DataTableScan.java:89)
        at org.apache.iceberg.BaseTableScan.planFiles(BaseTableScan.java:211)
        at org.apache.iceberg.DataTableScan.planFiles(DataTableScan.java:28)
        at org.apache.iceberg.BaseTableScan.planTasks(BaseTableScan.java:244)
        at org.apache.iceberg.DataTableScan.planTasks(DataTableScan.java:28)
        at org.apache.iceberg.data.TableScanIterable.<init>(TableScanIterable.java:36)
        at org.apache.iceberg.data.IcebergGenerics$ScanBuilder.build(IcebergGenerics.java:91)
        at org.datacloud.flinksql.hive.HiveTest.testFilter(HiveTest.java:97)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
        at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
        at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
        at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
        at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
        at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
        at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
        at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
        at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
        at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
        at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
        at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
        at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
        at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:89)
        at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:41)
        at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:541)
        at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:763)
        at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:463)
        at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:209)
   ```
   
    
   If I query without a filter condition:
   
   ```
   Iterable<Record> results = IcebergGenerics.read(table).build();
   ```
   This returns all the data, with the deleted record filtered out.
   
   What is causing this? Am I using the API incorrectly?

