On Fri, Sep 13, 2019 at 1:47 PM Ryan Blue <[email protected]> wrote:
> Hi Dave,
>
> I'm sure we can get this working, but I'd like to understand what you're
> trying to do a bit better.
>
> Why do you need atomic rename? Iceberg is set up to write data in place
> and not move or rename files. Committing those files to a table is an
> atomic operation instead. Everything should work with GCS without
> modification as far as I know, unless you don't want to use the Hadoop
> FileSystem APIs.

There is no native atomic rename in GCS; a "rename" there is a copy followed
by a delete. From https://iceberg.apache.org/spec/#mvcc-and-optimistic-concurrency :

"Tables do not require rename, except for tables that use atomic rename to
implement the commit operation for new metadata files."

This ^ is what we are addressing, i.e. the point when a snapshot commit
occurs and the tmp metadata file is renamed to the snapshot metadata file.

From HadoopTableOperations.java L#248:

/**
 * Renames the source file to destination, using the provided file system.
 * If the rename failed, an attempt will be made to delete the source file.
 *
 * @param fs the filesystem used for the rename
 * @param src the source file
 * @param dst the destination file
 */
private void renameToFinal(FileSystem fs, Path src, Path dst) {
  try {
    if (!fs.rename(src, dst)) {
      ...

The above is called during commit, and AFAIK it comes with the assumption
that FileSystem.rename() is atomic... ?

> Keeping file appenders open using a write property or a table property
> sounds like a good idea to me. I wouldn't want this to be the default for
> batch writes, but I think it may make sense as an option for streaming
> writes. I'd prefer to add these features to the existing streaming writer
> instead of allowing users to use their own custom writer. Are there other
> reasons to replace the writer instead of making this behavior configurable?

Nope, that was the only reason. That is fine then, if this could be
supported for streaming writes.
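To make the commit assumption concrete, here is a minimal, hypothetical sketch (not Iceberg code, and not using the Hadoop FileSystem API) of the property the rename-based commit relies on: writing the new metadata to a temp file, then atomically publishing it so that exactly one concurrent committer can win. Since java.nio's ATOMIC_MOVE silently overwrites an existing target on POSIX, this sketch emulates the fail-if-exists behavior with an atomic hard-link creation; the class and method names (AtomicCommitSketch, commit) are invented for illustration.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch of a commit-by-atomic-publish, for illustration only.
// On HDFS, FileSystem.rename() provides this "either I win or I don't"
// semantics natively; on GCS, where rename is copy + delete, it does not.
public class AtomicCommitSketch {

    // Writes content to a temp file, then atomically links it into place.
    // Returns true if this committer won, false if dst already existed.
    static boolean commit(Path tmp, Path dst, byte[] content) throws IOException {
        Files.write(tmp, content);
        try {
            // Hard-link creation is atomic and fails if dst already exists,
            // mimicking a rename that refuses to clobber a committed file.
            Files.createLink(dst, tmp);
            return true;
        } catch (FileAlreadyExistsException e) {
            return false; // another committer won the race
        } finally {
            // Mirror renameToFinal's cleanup: remove the source/temp file.
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("commit-sketch");
        Path dst = dir.resolve("v2.metadata.json");
        boolean first = commit(dir.resolve("tmp-1"), dst, "snapshot-1".getBytes());
        boolean second = commit(dir.resolve("tmp-2"), dst, "snapshot-2".getBytes());
        System.out.println(first + " " + second); // exactly one committer wins
    }
}
```

The point of the sketch is only the semantics: if the publish step can be interleaved with another writer's copy + delete (as on GCS), two committers can both believe they succeeded, which is why a different commit primitive is needed there.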
