quanzhian commented on issue #1733:
URL: https://github.com/apache/incubator-seatunnel/issues/1733#issuecomment-1110632165

   > > @BenJFan There is an error in the decompression code; the fix is as follows.
   > > See the class org.apache.seatunnel.utils.CompressionUtils.
   > > Fixed code:
   > > ```java
   > >     /**
   > >      * Untar an input file into an output directory.
   > >      * <p>
   > >      * The output files are created in the output folder, keeping the
   > >      * names of the entries in the input file.
   > >      *
   > >      * @param inputFile the input .tar file
   > >      * @param outputDir the output directory.
   > >      * @throws IOException           io exception
   > >      * @throws FileNotFoundException file not found exception
   > >      * @throws ArchiveException      archive exception
   > >      */
   > >     public static void unTar(final File inputFile, final File outputDir) throws IOException, ArchiveException {
   > > 
   > >         LOGGER.info("Untaring {} to dir {}.", inputFile.getAbsolutePath(), outputDir.getAbsolutePath());
   > > 
   > >         final List<File> untaredFiles = new LinkedList<>();
   > >         try (final InputStream is = new FileInputStream(inputFile);
   > >              final TarArchiveInputStream debInputStream =
   > >                      (TarArchiveInputStream) new ArchiveStreamFactory().createArchiveInputStream("tar", is)) {
   > >             TarArchiveEntry entry;
   > >             while ((entry = (TarArchiveEntry) debInputStream.getNextEntry()) != null) {
   > >                 // Normalize the entry path so redundant "./" and "../" segments are resolved.
   > >                 final File outputFile = new File(outputDir, entry.getName()).toPath().normalize().toFile();
   > >                 if (entry.isDirectory()) {
   > >                     LOGGER.info("Attempting to write output directory {}.", outputFile.getAbsolutePath());
   > >                     if (!outputFile.exists()) {
   > >                         LOGGER.info("Attempting to create output directory {}.", outputFile.getAbsolutePath());
   > >                         if (!outputFile.mkdirs()) {
   > >                             throw new IllegalStateException(String.format("Couldn't create directory %s.", outputFile.getAbsolutePath()));
   > >                         }
   > >                     }
   > >                 } else {
   > >                     LOGGER.info("Creating output file {}.", outputFile.getAbsolutePath());
   > >                     // Create missing parent directories before writing the file.
   > >                     File outputParentFile = outputFile.getParentFile();
   > >                     if (outputParentFile != null && !outputParentFile.exists()) {
   > >                         outputParentFile.mkdirs();
   > >                     }
   > >                     // try-with-resources so the stream is closed even if the copy fails.
   > >                     try (final OutputStream outputFileStream = new FileOutputStream(outputFile)) {
   > >                         IOUtils.copy(debInputStream, outputFileStream);
   > >                     }
   > >                 }
   > >                 untaredFiles.add(outputFile);
   > >             }
   > >         }
   > >     }
   > > ```
   > > 
   > > Old code (contains the incorrect check):
   > > ```java
   > >     /**
   > >      * Untar an input file into an output file.
   > >      * <p>
   > >      * The output file is created in the output folder, having the same name
   > >      * as the input file, minus the '.tar' extension.
   > >      *
   > >      * @param inputFile the input .tar file
   > >      * @param outputDir the output directory file.
   > >      * @throws IOException           io exception
   > >      * @throws FileNotFoundException file not found exception
   > >      * @throws ArchiveException      archive exception
   > >      */
   > >     public static void unTar(final File inputFile, final File outputDir) throws IOException, ArchiveException {
   > > 
   > >         LOGGER.info("Untaring {} to dir {}.", inputFile.getAbsolutePath(), outputDir.getAbsolutePath());
   > > 
   > >         final List<File> untaredFiles = new LinkedList<>();
   > >         try (final InputStream is = new FileInputStream(inputFile);
   > >              final TarArchiveInputStream debInputStream =
   > >                      (TarArchiveInputStream) new ArchiveStreamFactory().createArchiveInputStream("tar", is)) {
   > >             TarArchiveEntry entry = null;
   > >             while ((entry = (TarArchiveEntry) debInputStream.getNextEntry()) != null) {
   > >                 final File outputFile = new File(outputDir, entry.getName());
   > >                 if (!outputFile.toPath().normalize().startsWith(outputDir.toPath())) {
   > >                     throw new IllegalStateException("Bad zip entry");
   > >                 }
   > >                 if (entry.isDirectory()) {
   > >                     LOGGER.info("Attempting to write output directory {}.", outputFile.getAbsolutePath());
   > >                     if (!outputFile.exists()) {
   > >                         LOGGER.info("Attempting to create output directory {}.", outputFile.getAbsolutePath());
   > >                         if (!outputFile.mkdirs()) {
   > >                             throw new IllegalStateException(String.format("Couldn't create directory %s.", outputFile.getAbsolutePath()));
   > >                         }
   > >                     }
   > >                 } else {
   > >                     LOGGER.info("Creating output file {}.", outputFile.getAbsolutePath());
   > >                     final OutputStream outputFileStream = new FileOutputStream(outputFile);
   > >                     IOUtils.copy(debInputStream, outputFileStream);
   > >                     outputFileStream.close();
   > >                 }
   > >                 untaredFiles.add(outputFile);
   > >             }
   > >         }
   > >     }
   > > ```
   > > 
   > > Here are my test details:
   > > ```
   > > [xxxxxx@bigdata-app03 apache-seatunnel-incubating-2.1.1-SNAPSHOT]# ./bin/start-seatunnel-spark.sh --master yarn --deploy-mode cluster --config /mnt/services/seatunnel/spark_batch.conf
   > > 22/04/27 14:33:43 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
   > > 22/04/27 14:33:44 WARN DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
   > > 22/04/27 14:33:44 INFO EsServiceCredentialProvider: Loaded EsServiceCredentialProvider
   > > 22/04/27 14:33:44 INFO Client: Requesting a new application from cluster with 5 NodeManagers
   > > 22/04/27 14:33:44 INFO Configuration: found resource resource-types.xml at file:/etc/hadoop/3.1.4.0-315/0/resource-types.xml
   > > 22/04/27 14:33:44 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (6144 MB per container)
   > > 22/04/27 14:33:44 INFO Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
   > > 22/04/27 14:33:44 INFO Client: Setting up container launch context for our AM
   > > 22/04/27 14:33:44 INFO Client: Setting up the launch environment for our AM container
   > > 22/04/27 14:33:44 INFO Client: Preparing resources for our AM container
   > > 22/04/27 14:33:45 INFO EsServiceCredentialProvider: Hadoop Security Enabled = [false]
   > > 22/04/27 14:33:45 INFO EsServiceCredentialProvider: ES Auth Method = [SIMPLE]
   > > 22/04/27 14:33:45 INFO EsServiceCredentialProvider: Are creds required = [false]
   > > 22/04/27 14:33:45 INFO Client: Source and destination file systems are the same. Not copying hdfs:/hdp/apps/3.1.4.0-315/spark2/spark2-hdp-yarn-archive.tar.gz
   > > 22/04/27 14:33:45 INFO Client: Uploading resource file:/mnt/services/seatunnel/apache-seatunnel-incubating-2.1.1-SNAPSHOT/lib/seatunnel-core-spark.jar -> hdfs://nameservice1/user/xxx_user/.sparkStaging/application_1643094720025_42454/seatunnel-core-spark.jar
   > > 22/04/27 14:33:46 INFO Client: Uploading resource file:/mnt/services/seatunnel/apache-seatunnel-incubating-2.1.1-SNAPSHOT/plugins.tar.gz -> hdfs://nameservice1/user/xxx_user/.sparkStaging/application_1643094720025_42454/plugins.tar.gz
   > > 22/04/27 14:33:46 INFO Client: Uploading resource file:/mnt/services/seatunnel/spark_batch.conf -> hdfs://nameservice1/user/xxx_user/.sparkStaging/application_1643094720025_42454/spark_batch.conf
   > > 22/04/27 14:33:46 INFO Client: Uploading resource file:/tmp/spark-5d399c9e-df19-4881-8a0b-67dd57f3f6c2/__spark_conf__1201408946509169751.zip -> hdfs://nameservice1/user/xxx_user/.sparkStaging/application_1643094720025_42454/__spark_conf__.zip
   > > 22/04/27 14:33:46 INFO SecurityManager: Changing view acls to: xxxxxx,xxx_user
   > > 22/04/27 14:33:46 INFO SecurityManager: Changing modify acls to: xxxxxx,xxx_user
   > > 22/04/27 14:33:46 INFO SecurityManager: Changing view acls groups to: 
   > > 22/04/27 14:33:46 INFO SecurityManager: Changing modify acls groups to: 
   > > 22/04/27 14:33:46 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(xxxxxx, xxx_user); groups with view permissions: Set(); users  with modify permissions: Set(xxxxxx, xxx_user); groups with modify permissions: Set()
   > > 22/04/27 14:33:46 INFO Client: Submitting application application_1643094720025_42454 to ResourceManager
   > > 22/04/27 14:33:46 INFO YarnClientImpl: Submitted application application_1643094720025_42454
   > > 22/04/27 14:33:47 INFO Client: Application report for application_1643094720025_42454 (state: ACCEPTED)
   > > 22/04/27 14:33:47 INFO Client: 
   > >   client token: N/A
   > >   diagnostics: AM container is launched, waiting for AM container to Register with RM
   > >   ApplicationMaster host: N/A
   > >   ApplicationMaster RPC port: -1
   > >   queue: default
   > >   start time: 1651041226887
   > >   final status: UNDEFINED
   > >   tracking URL: http://bigdata-master01:8088/proxy/application_1643094720025_42454/
   > >   user: xxx_user
   > > 22/04/27 14:33:48 INFO Client: Application report for application_1643094720025_42454 (state: ACCEPTED)
   > > 22/04/27 14:33:49 INFO Client: Application report for application_1643094720025_42454 (state: ACCEPTED)
   > > 22/04/27 14:33:50 INFO Client: Application report for application_1643094720025_42454 (state: ACCEPTED)
   > > 22/04/27 14:33:51 INFO Client: Application report for application_1643094720025_42454 (state: RUNNING)
   > > 22/04/27 14:33:51 INFO Client: 
   > >   client token: N/A
   > >   diagnostics: N/A
   > >   ApplicationMaster host: 172.18.247.16
   > >   ApplicationMaster RPC port: 0
   > >   queue: default
   > >   start time: 1651041226887
   > >   final status: UNDEFINED
   > >   tracking URL: http://bigdata-master01:8088/proxy/application_1643094720025_42454/
   > >   user: xxx_user
   > > 22/04/27 14:33:52 INFO Client: Application report for application_1643094720025_42454 (state: RUNNING)
   > > 22/04/27 14:33:53 INFO Client: Application report for application_1643094720025_42454 (state: RUNNING)
   > > 22/04/27 14:33:54 INFO Client: Application report for application_1643094720025_42454 (state: RUNNING)
   > > 22/04/27 14:33:55 INFO Client: Application report for application_1643094720025_42454 (state: RUNNING)
   > > 22/04/27 14:33:56 INFO Client: Application report for application_1643094720025_42454 (state: RUNNING)
   > > 22/04/27 14:33:57 INFO Client: Application report for application_1643094720025_42454 (state: RUNNING)
   > > 22/04/27 14:33:58 INFO Client: Application report for application_1643094720025_42454 (state: RUNNING)
   > > 22/04/27 14:33:59 INFO Client: Application report for application_1643094720025_42454 (state: RUNNING)
   > > 22/04/27 14:34:00 INFO Client: Application report for application_1643094720025_42454 (state: FINISHED)
   > > 22/04/27 14:34:00 INFO Client: 
   > >   client token: N/A
   > >   diagnostics: N/A
   > >   ApplicationMaster host: 172.18.247.16
   > >   ApplicationMaster RPC port: 0
   > >   queue: default
   > >   start time: 1651041226887
   > >   final status: SUCCEEDED
   > >   tracking URL: http://bigdata-master01:8088/proxy/application_1643094720025_42454/
   > >   user: xxx_user
   > > 22/04/27 14:34:00 INFO Client: Deleted staging directory hdfs://nameservice1/user/xxx_user/.sparkStaging/application_1643094720025_42454
   > > 22/04/27 14:34:00 INFO ShutdownHookManager: Shutdown hook called
   > > 22/04/27 14:34:00 INFO ShutdownHookManager: Deleting directory /tmp/spark-5d399c9e-df19-4881-8a0b-67dd57f3f6c2
   > > 22/04/27 14:34:00 INFO ShutdownHookManager: Deleting directory /tmp/spark-121ad009-6b38-468d-a4eb-a5faf4dbb28d
   > > ```
   > 
   > @quanzhian Can you create a PR to fix that? Welcome to the contributor family!
   
   ok
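For context on the quoted fix: one plausible reason the old guard misfires is that `Path.startsWith` compares path elements literally, so a non-normalized output directory path (e.g. containing a `./` segment) can cause legitimate entries to be rejected as "Bad zip entry". A minimal sketch of that behavior follows; the class name `PathCheckDemo` and the sample paths are illustrative assumptions, not taken from the issue.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class PathCheckDemo {

    // Mirrors the guard in the old unTar code: reject an entry whose
    // normalized resolved path does not start with the raw outputDir path.
    public static boolean oldCheckRejects(String outputDir, String entryName) {
        Path out = Paths.get(outputDir);
        Path candidate = out.resolve(entryName).normalize();
        return !candidate.startsWith(out);
    }

    public static void main(String[] args) {
        // A traversal entry is rejected, as intended.
        System.out.println(oldCheckRejects("plugins", "../evil.sh"));   // true
        // A legitimate entry is ALSO rejected when the base path has a
        // redundant "./" segment: normalize() strips it from the
        // candidate ("plugins/good.txt") but not from the base ("./plugins").
        System.out.println(oldCheckRejects("./plugins", "good.txt"));   // true (false positive)
        // With an already-normalized base path the same entry passes.
        System.out.println(oldCheckRejects("plugins", "good.txt"));     // false
    }
}
```

This suggests a check that normalizes both sides before comparing would keep the zip-slip protection without the false positives, though the fix quoted above drops the check in favor of normalizing the entry path.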


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
