ableegoldman opened a new issue, #16836:
URL: https://github.com/apache/datafusion/issues/16836
### Describe the bug
Hi, I've recently started using DataFusion and have run into an issue trying
to copy some query results into a local cache implemented with a `MemTable`.
Here is the code:
```rust
// execute the query and get the DataFrame:
let df = self
    .ctx
    .execute_logical_plan(plan.clone())
    .await
    .map_err(anyhow::Error::msg)?;
// print total number of rows
info!("Number of DB results {}", df.clone().count().await?);
// insert into the local table
let plan = cached_table
    .provider
    .insert_into(
        &self.ctx.state(),
        df.clone().create_physical_plan().await?,
        InsertOp::Append,
    )
    .await?;
let task_ctx = self.ctx.task_ctx();
let mut stream = plan.execute(0, task_ctx)?;
// print number of rows inserted
while let Some(batch) = stream.try_next().await? {
    let rows = batch.column_by_name("count");
    info!("Inserted {:?} rows into cache table {}", rows, id);
}
```
There are 731 result rows, but every time I run this only 87 rows are
inserted into the MemTable/cache. I've confirmed this is the accurate count of
rows inserted because some later code scans this cache and indeed finds only 87
rows. The number 87 is consistent for me, with the 731 original rows, but
varies slightly depending on how many total rows there are -- for example, my
teammate had 751 rows in his backing db, and saw it repeatedly insert only 90
rows instead of 87.
Any idea why only a small subset of these rows is being inserted? There is
only 1 partition btw (according to
`plan.output_partitioning().partition_count()`).
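
To make the mismatch concrete, here is a minimal, std-only sketch of the invariant the report expects to hold: the rows summed over every batch of every partition in the cache should equal the rows the query produced. `Batch`, `total_rows`, and `check_cache` are hypothetical names used only for illustration and are not DataFusion APIs. Representing each side as a single batch is also an assumption, since the report does not say how the rows are split into batches.

```rust
/// Hypothetical stand-in for a record batch: only the row count matters here.
struct Batch {
    num_rows: usize,
}

/// Total rows across every batch of every partition.
fn total_rows(partitions: &[Vec<Batch>]) -> usize {
    partitions
        .iter()
        .flat_map(|batches| batches.iter())
        .map(|batch| batch.num_rows)
        .sum()
}

/// Compare the rows the query produced with the rows found in the cache.
fn check_cache(source: &[Vec<Batch>], cache: &[Vec<Batch>]) -> Result<usize, String> {
    let expected = total_rows(source);
    let actual = total_rows(cache);
    if expected == actual {
        Ok(actual)
    } else {
        Err(format!("cache holds {} rows, expected {}", actual, expected))
    }
}

fn main() {
    // Numbers from the report: 731 rows produced, 87 found in the cache.
    // One batch per side is an assumption made for this sketch.
    let source = vec![vec![Batch { num_rows: 731 }]];
    let cache = vec![vec![Batch { num_rows: 87 }]];
    match check_cache(&source, &cache) {
        Ok(rows) => println!("cache matches source: {} rows", rows),
        Err(msg) => println!("mismatch: {}", msg),
    }
}
```

Running the same kind of per-batch, per-partition tally against the real input plan and the real cache scan would show whether rows are lost before the insert, during it, or only in the reported count.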
### To Reproduce
_No response_
### Expected behavior
_No response_
### Additional context
_No response_