[jira] [Updated] (NIFI-12192) GitHub Action Workflows could better report Tests Results and Summaries

2023-10-29 Thread Chris Sampson (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Sampson updated NIFI-12192:
-
Description: 
Several of the GitHub Action Workflows produce {{JUnit}} XML result output 
files through {{surefire}} and {{failsafe}} plugins. These are currently 
uploaded as raw logs/XML files for people to download from GitHub, but it 
should be possible to upload these results as {{checks}} within GitHub instead, 
which would be easier for people to consume and understand what tests are 
failing in their PRs.

One possible action to use would be 
[test-reporter|https://github.com/marketplace/actions/test-reporter], although 
an attempt to use this in NIFI-12177 as part of the new {{integration-tests}} 
Workflow failed because it seems we're unable to {{write}} to the GitHub repo 
(even with {code:yaml}permissions.checks: write{code} set within the Workflow 
definition). This might be an intentional setting, and could require discussion 
with the ASF Infrastructure team to understand whether this is/could be 
permitted.

Also consider adding test coverage reporting, at least for one of the 
ci-integration jobs - https://github.com/marketplace/actions/jacoco-report

  was:
Several of the GitHub Action Workflows produce {{JUnit}} XML result output 
files through {{surefire}} and {{failsafe}} plugins. These are currently 
uploaded as raw logs/XML files for people to download from GitHub, but it 
should be possible to upload these results as {{checks}} within GitHub instead, 
which would be easier for people to consume and understand what tests are 
failing in their PRs.

One possible action to use would be 
[test-reporter|https://github.com/marketplace/actions/test-reporter], although 
an attempt to use this in NIFI-12177 as part of the new {{integration-tests}} 
Workflow failed because it seems we're unable to {{write}} to the GitHub repo 
(even with {code:yaml}permissions.checks: write{code} set within the Workflow 
definition). This might be an intentional setting, and could require discussion 
with the ASF Infrastructure team to understand whether this is/could be 
permitted.


> GitHub Action Workflows could better report Tests Results and Summaries
> ---
>
> Key: NIFI-12192
> URL: https://issues.apache.org/jira/browse/NIFI-12192
> Project: Apache NiFi
> Issue Type: Improvement
> Reporter: Chris Sampson
> Priority: Minor
> Fix For: 2.latest
>
>
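
As a rough illustration of what a results summary built from the {{surefire}}/{{failsafe}} JUnit XML files might contain, here is a minimal Python sketch. It is not the test-reporter action and not part of any NiFi workflow; the function name, the sample XML, and the class names in it are hypothetical, and it assumes the usual JUnit layout of {{testcase}} elements carrying {{failure}} or {{error}} children.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the common JUnit XML layout (not real NiFi output).
SAMPLE_REPORT = """<testsuite name="org.example.FooTest" tests="2" failures="1">
  <testcase classname="org.example.FooTest" name="testA" time="0.01"/>
  <testcase classname="org.example.FooTest" name="testB" time="0.02">
    <failure message="expected 1 but was 2">stack trace here</failure>
  </testcase>
</testsuite>"""


def summarize_junit(xml_text):
    """Return (total test cases, list of failing 'classname.name' strings)."""
    root = ET.fromstring(xml_text)
    total = 0
    failed = []
    # iter() walks the whole tree, so this works whether the root element
    # is a single <testsuite> or a <testsuites> wrapper.
    for case in root.iter("testcase"):
        total += 1
        if case.find("failure") is not None or case.find("error") is not None:
            failed.append(f"{case.get('classname')}.{case.get('name')}")
    return total, failed


total, failed = summarize_junit(SAMPLE_REPORT)
print(f"{total} tests, {len(failed)} failing")
for name in failed:
    print(f"  FAILED: {name}")
```

A real check would still need permission to write to the repository, which is the open question raised in the description.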



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (NIFI-8932) Add feature to CSVReader to skip N lines at top of the file

2023-10-29 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-8932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-8932:
---
Status: Patch Available  (was: In Progress)

> Add feature to CSVReader to skip N lines at top of the file
> ---
>
> Key: NIFI-8932
> URL: https://issues.apache.org/jira/browse/NIFI-8932
> Project: Apache NiFi
> Issue Type: Improvement
> Reporter: Philipp Korniets
> Assignee: Matt Burgess
> Priority: Minor
> Fix For: 1.latest, 2.latest
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> We have a lot of CSV files where the provider adds a custom header/footer 
> around otherwise valid CSV content, so the actual CSV header is the second row.
> To remove the unnecessary data we can currently use
>  * ReplaceText 
>  * SplitText -> RouteOnAttribute -> MergeContent
> It would be great to have an option in the CSVReader controller service to 
> skip N rows from the top/bottom in order to get clean data.
>  * skip N from the top
>  * skip M from the bottom
> A similar request was implemented in Flink: 
> https://issues.apache.org/jira/browse/FLINK-1002
>  
> Data Example:
> {code}
> 7/20/21 2:48:47 AM GMT-04:00  ABB: Blended Rate Calc (X),,,
> distribution_id,Distribution Id,settle_date,group_code,company_name,currency_code,common_account_name,business_date,prod_code,security,class,asset_type
> -1,all,20210719,Repo 21025226,qwerty                                    ,EUR,TPSL_21025226   ,19-Jul-21,BRM96ST7   ,ABC 14/09/24,NR,BOND  
> -1,all,20210719,Repo 21025226,qwerty                                    ,GBP,RPSS_21025226   ,19-Jul-21,,Total @ -0.11,,
> {code}
> |7/20/21 2:48:47 AM GMT-04:00  ABB: Blended Rate Calc (X)|  |  |  |  |  |  |  |  |  |  |  |
> |distribution_id|Distribution Id|settle_date|group_code|company_name|currency_code|common_account_name|business_date|prod_code|security|class|asset_type|
> |-1|all|20210719|Repo 21025226|qwerty                                    |EUR|TPSL_21025226   |19-Jul-21|BRM96ST7   |ABC 14/09/24|NR|BOND  |
> |-1|all|20210719|Repo 21025226|qwerty                                    |GBP|RPSS_21025226   |19-Jul-21| |Total @ -0.11| | |





[PR] NIFI-8932: Add capability to skip first N rows in CSVReader [nifi]

2023-10-29 Thread via GitHub


mattyb149 opened a new pull request, #7952:
URL: https://github.com/apache/nifi/pull/7952

   # Summary
   
   [NIFI-8932](https://issues.apache.org/jira/browse/NIFI-8932) This PR adds 
to CSVReader the capability to skip the first N rows of an incoming FlowFile, 
for cases where headers or other invalid records appear at the top of the file.
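
   The idea of skipping leading rows can be sketched outside NiFi as follows. This is a minimal Python illustration, not the code in this PR: the function name `read_csv_skipping_top` is hypothetical, the sample is a shortened version of the data in the Jira issue, and the sketch assumes that dropping raw lines is acceptable (a quoted field spanning several lines in the skipped region would need record-aware skipping instead).

```python
import csv
import io
from itertools import islice


def read_csv_skipping_top(stream, skip_lines):
    """Discard the first `skip_lines` raw lines, then parse the rest as CSV
    whose first remaining line is the header row."""
    remaining = islice(stream, skip_lines, None)
    return list(csv.DictReader(remaining))


# Shortened, hypothetical sample: one junk line above the real header.
sample = (
    "7/20/21 2:48:47 AM GMT-04:00  ABB: Blended Rate Calc (X),,,\n"
    "distribution_id,settle_date,currency_code\n"
    "-1,20210719,EUR\n"
    "-1,20210719,GBP\n"
)

rows = read_csv_skipping_top(io.StringIO(sample), 1)
for row in rows:
    print(row["settle_date"], row["currency_code"])
```

   With `skip_lines` set to 0 the same function reads the file unchanged, so the option can default to the existing behavior.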
   
   # Tracking
   
   Please complete the following tracking steps prior to pull request creation.
   
   ### Issue Tracking
   
   - [x] [Apache NiFi Jira](https://issues.apache.org/jira/browse/NIFI) issue 
created
   
   ### Pull Request Tracking
   
   - [x] Pull Request title starts with Apache NiFi Jira issue number, such as 
`NIFI-0`
   - [x] Pull Request commit message starts with Apache NiFi Jira issue number, 
such as `NIFI-0`
   
   ### Pull Request Formatting
   
   - [x] Pull Request based on current revision of the `main` branch
   - [x] Pull Request refers to a feature branch with one commit containing 
changes
   
   # Verification
   
   Please indicate the verification steps performed prior to pull request 
creation.
   
   ### Build
   
   - [x] Build completed using `mvn clean install -P contrib-check`
     - [x] JDK 21
   
   ### Licensing
   
   - [ ] New dependencies are compatible with the [Apache License 
2.0](https://apache.org/licenses/LICENSE-2.0) according to the [License 
Policy](https://www.apache.org/legal/resolved.html)
   - [ ] New dependencies are documented in applicable `LICENSE` and `NOTICE` 
files
   
   ### Documentation
   
   - [x] Documentation formatting appears as expected in rendered files
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (NIFI-8932) Add feature to CSVReader to skip N lines at top of the file

2023-10-29 Thread Matt Burgess (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-8932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17780663#comment-17780663
 ] 

Matt Burgess commented on NIFI-8932:


This Jira will cover only skipping rows at the top. Skipping bottom rows 
involves a different technique: read in N rows before returning any records, 
keep track of them in a queue, feed each newly read record into the bottom of 
the queue as the oldest one is returned, and clear the queue once no more 
records are available. Please feel free to file a follow-on Jira to handle 
skipping the last N rows.
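
The queue-based technique described above can be sketched roughly as follows. This is a minimal Python illustration under the stated assumptions, not NiFi code: the name `skip_last` is hypothetical, and records are modeled as any iterable.

```python
from collections import deque


def skip_last(records, n):
    """Yield every record except the final `n`, reading the input only once."""
    if n <= 0:
        yield from records
        return
    buffer = deque()
    for record in records:
        buffer.append(record)          # feed the new record into the bottom
        if len(buffer) > n:
            yield buffer.popleft()     # the oldest record is now safe to emit
    # Input exhausted: the buffer holds the last n records (or fewer, when the
    # input was shorter than n) and is simply discarded.
    buffer.clear()


rows = ["header", "row1", "row2", "footer1", "footer2"]
print(list(skip_last(rows, 2)))
```

The reader only ever holds N records in memory, at the cost of returning each record N reads late.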



