[ 
https://issues.apache.org/jira/browse/BEAM-5959?focusedWorklogId=175993&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-175993
 ]

ASF GitHub Bot logged work on BEAM-5959:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 17/Dec/18 12:40
            Start Date: 17/Dec/18 12:40
    Worklog Time Spent: 10m 
      Work Description: lgajowy commented on a change in pull request #7266: 
[BEAM-5959] Add performance testing for writing many files
URL: https://github.com/apache/beam/pull/7266#discussion_r242131272
 
 

 ##########
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSink.java
 ##########
 @@ -758,7 +758,10 @@ final void moveToOutputFiles(
       }
       // During a failure case, files may have been deleted in an earlier step. Thus
       // we ignore missing files here.
+      long startTime = System.nanoTime();
       FileSystems.rename(srcFiles, dstFiles, StandardMoveOptions.IGNORE_MISSING_FILES);
+      long endTime = System.nanoTime();
+      LOG.info("Renamed {} files in {} seconds.", srcFiles.size(), (endTime - startTime) / 1e9);
 
 Review comment:
   I don't know of an easy way yet. On closer inspection, I noticed that the
Jenkins dashboard shows only Perfkit logs; logs from Gradle (which Perfkit
runs) are omitted. This is a separate issue, imo.
   
   Are you fine for now with having this printed only locally?
   
   If not, then:
    1. we should fix logging in the IOITs first
    2. use the Metrics API and publish this metric to a separate BQ table. This
would be consistent with future plans for IO tests: we plan to use the Metrics
API for all metrics instead of relying only on Perfkit timers. See
LoadTest.java for a similar approach (collecting time and total bytes, then
publishing to BQ):
https://github.com/apache/beam/blob/bd5bbf9e1eca68f4f743da4060846f2df27d81f5/sdks/java/testing/load-tests/src/main/java/org/apache/beam/sdk/loadtests/LoadTest.java#L78
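
   To illustrate the idea of returning the rename time as a value instead of
only logging it, here is a minimal, standalone sketch. It is not the Beam
implementation: `RenameTimer` and `timedRenameNanos` are hypothetical names, and
`java.nio.file.Files.move` stands in for `FileSystems.rename` with
`IGNORE_MISSING_FILES`. In Beam itself, the measured value could presumably be
reported through the Metrics API (for example, a distribution metric) rather
than returned to the caller.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: times a batch rename so the value can be published as a metric. */
public class RenameTimer {

  /**
   * Renames each source file to the destination at the same index and returns
   * the elapsed wall-clock time in nanoseconds. Missing source files are
   * skipped, mirroring the intent of IGNORE_MISSING_FILES in the diff above.
   */
  public static long timedRenameNanos(List<Path> srcFiles, List<Path> dstFiles)
      throws IOException {
    if (srcFiles.size() != dstFiles.size()) {
      throw new IllegalArgumentException("Source and destination lists must have equal size");
    }
    long startTime = System.nanoTime();
    for (int i = 0; i < srcFiles.size(); i++) {
      try {
        Files.move(srcFiles.get(i), dstFiles.get(i), StandardCopyOption.REPLACE_EXISTING);
      } catch (NoSuchFileException e) {
        // The source may have been removed in an earlier step; ignore it.
      }
    }
    return System.nanoTime() - startTime;
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("rename-timer");
    Path src = Files.createFile(dir.resolve("part-0.tmp"));
    Path dst = dir.resolve("part-0.out");

    long nanos = timedRenameNanos(Arrays.asList(src), Arrays.asList(dst));

    // A caller could publish this value to a metrics sink instead of printing it.
    System.out.printf("Renamed %d files in %.6f seconds.%n", 1, nanos / 1e9);
  }
}
```

   Returning the elapsed time keeps the measurement independent of where it is
reported, so the same value could go to a local log or to a metrics sink.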
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 175993)
    Time Spent: 7h 10m  (was: 7h)

> Add Cloud KMS support to GCS copies
> -----------------------------------
>
>                 Key: BEAM-5959
>                 URL: https://issues.apache.org/jira/browse/BEAM-5959
>             Project: Beam
>          Issue Type: Bug
>          Components: io-java-gcp, sdk-py-core
>            Reporter: Udi Meiri
>            Assignee: Udi Meiri
>            Priority: Major
>          Time Spent: 7h 10m
>  Remaining Estimate: 0h
>
> Beam SDK currently uses the CopyTo GCS API call, which doesn't support 
> copying objects that use Customer Managed Encryption Keys (CMEK).
> CMEKs are managed in Cloud KMS.
> Items (for Java and Python SDKs):
> - Update clients to versions that support KMS keys.
> - Change copyTo API calls to use rewriteTo (Python - directly, Java - 
> possibly convert copyTo API call to use client library)
> - Add unit tests.
> - Add basic tests (DirectRunner and GCS buckets with CMEK).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
