This is an automated email from the ASF dual-hosted git repository.
wphyo pushed a change to branch aqacf-2
in repository https://gitbox.apache.org/repos/asf/sdap-in-situ-data-services.git
at 05c8e03 fix: still cannot start pyspark
This branch includes the following new commits:
new 6cc8e49 Merge AQACF specific code
new d615b4e Update sub_collection_statistics code to work with AQ
new d746380 Update parquet_file_es_indexer code to work with AQ
new 8f7db69 Add partition by site
new 9ea0877 Update index_to_es lambda to make it work with AQ
new 733a6bb Fix "site" in ingestion and query
new 6e714f6 Fix more issue with "site" in ingestion
new 150de39 Update sub collection statistics format
new ad2e091 Update readme to include instruction to build lambda container
new 4bca10f Revert back to using "platform" instead of "site"
new 4f99eef Change platform_code to platform_id
new e529545 Add capability to query by platform (platform_id)
new f7d2cce Fix query by platform id bug
new 69ba023 Add platform short name
new 089a099 Fix parquet short name related issues
new 126b45f All query by multiple providers and projects
new 5f18a43 feat: add relative humidity
new b2871b7 feat: airnow etl job
new d167787 feat: add tf + update setup for lambda
new f378256 feat: use requirements.txt to install
new e4c79db fix: add lat-lon. no need to run 2 times
new 4292f2e feat: add merge logic
new 6d9b7f1 fix: convert time column into formatted string
new 3d12900 feat: convert to parquet json format
new 38301d4 feat: make entire workflow + add log stmts
new 6d94e8b feat: add validation on entry script
new 1fb237f feat: add Jenkins
new 3e4cc23 fix: typo in Jenkins
new 0a6bc65 fix: hide default ENV + sample docker-compose file
new 3714254 feat: get terraform running
new 4fcadcd fix: reduce parallel size
new 850f5d2 fix: not using parallel
new cc58aa8 fix: remove break statement for debug purpose
new a325bf3 fix: split size from ENV
new 5261149 fix: write to json in batch manually assuming not enough memory
new 0ee5b0e fix: using manual temp folder is causing the error
new aac00d5 fix: forgot to update underlying method
new 3281779 fix: run in chunks to reduce cpu
new 165ef9e fix: raw data needs to be in raw project
new f4b136b fix: running it one by one
new c3c9072 fix: only working on downloaded data
new 0360f45 fix: adding cql filter
new 4e62014 fix: need prefix to query in elasticsearch
new 7f8df43 fix: calling observation_counts from constant file
new 648cb96 feat: add cql filter logic when pulling data
new 2347e78 fix: optional depth stats + observation stats
new 44c7c81 chore: add log to see why there is ((None AND (time_obj#41 >= cast(2023-08-01T00:00:00Z as timestamp)))
new d65f79a fix: filter_cql validation logic error
new 1dc77d9 feat: docker lambda in ci/cd
new e1dcce9 fix: python 3.7 residues
new ed95fc8 fix: python3.8 did not work. trying 3.7
new 965c9d5 fix: use requirement.txt to build it
new e91081f fix: using python3
new a5ed355 fix: still fixing docker build
new 123cb7c fix: wrong line in requirements file
new f78cb88 fix: remove some lib
new 51d34f8 fix: remove some lib
new 213d362 fix: still docker build error
new 38f6f83 fix: trying 3.8 now
new 934b2c7 fix: revert to python3.7
new bf4584d fix: allow to filter statistics
new d22304f feat: first version to ingest to ES
new 3e5fb79 fix: ES mapping
new d8c6429 fix: using pandas to mask some errors and speed things up
new 2d6d487 fix: refactor
new 19f3f04 feat: adding query endpoint
new 2643f9a fix: adding lat lon constraint
new f1b3cfd fix: forgot to add new endpoints
new 1dc1da8 fix: some tweak for airnow data
new d7c9879 fix: hiding next page if this is the last page. not just empty page
new ab88742 fix: some new endpoints for stats checking
new 8b7cf3a fix: some minor tweaks
new 09255d9 fix: agg for platform only
new 4f2e29d fix: pagination works now
new 8bf5a5f fix: add pagination to result
new 88a1f4b fix: need a different way to default the size
new d34ef01 fix: size is retrieved before assigned
new 1f0dd44 fix: need pagination input and output endpoint. standardize w/ same name from query
new e5c4a94 fix: create url. not just the marker platform
new 8b8dabb fix: ingestion lambdas
new b4d5cd7 fix: add missing pieces on terraform
new f0d7de3 fix: make some tweaks
new ac45984 fix: tf module should be working
new fcba9df feat: adding new ingester (in progress)
new fcc841c fix: tweak for nc ingestion from aws
new 9816292 feat: add 2 new endpoint
new bd8bfc9 fix: wrong int name in json schema
new 992ef09 fix: add new dependencies
new eafab94 fix: build failed
new 85d7c9c feat: adding zip file to artifactory
new 04ec5ad fix: adding artifactory
new 09b8a0c fix: still cred error
new d5ce3d3 fix: weird error
new 51b0606 fix: pushing to wrong place
new 2e1f7f4 fix: bug in ingestion staging lambda
new d77c8c8 fix: still bugs in ingestion pipeline
new 80c1918 fix: sns class added
new b1d6ca3 fix: typo
new 80ecf39 feat: redo ingestion via plugin style
new 2148596 fix: terraform sns fix (not tested)
new 5a1bf1b fix: workaround to send results
new 799f6f7 fix: trying using plugins
new 5736310 fix: fixing duplicated last item on pagination
new 61d7204 fix: next is still duplicated
new 780fc77 chore: shifting jenkins stage
new 8511fda fix: redo docker to use netcdf4 from conda
new 0fcc892 fix: a certain netcdf4 version
new 08bf4a6 fix: trying netcdf4 suggestion from stackoverflow
new a38e6ae fix: conditionally setting empty string if no short name
new d5233a0 fix: wrong validation code for sqs-sns-s3
new df83930 fix: add test and update errors
new 851c821 fix: update units on no2 and coj
new fdf1969 Merge branch 'master' into aqacf
new 23a7b6f chore: update lambda requirement
new a1c44d2 fix: option to add geo_partition
new 4140a13 fix: use ENV value with default backward compatibility for geo_interval
new a4ebf54 feat: allow gmu query (raw)
new ba6a835 fix: downgrade pyspark
new 3c806f0 fix: gmu errors
new 817e227 fix: get pagination working
new 0c7ab92 feat: use factory method
new 4687616 fix: gmu url comes from config
new 5e29764 fix: need to update deployment.yaml
new 25fbbd0 fix: secret scan updates
new 0ac7c23 fix: re-adding depth
new b19ba49 fix: federated stats page entry
new 96db4d9 add new index
new f7073c4 fix: add federated endpoints
new 85d179f fix: add auth to insert records
new 8c9881e fix: add missing file
new 4b60c2b fix: add missing import
new 47a70a3 fix: add sorting statement
new f0d0fed fix: implement es delete method + bug in validation
new ab07d98 fix: wrong key during query
new 41f3be1 feat: add code for gmu stats page + test code
new 95d6c3e fix: add test cases for gmu
new 8b17cb0 fix: some bug fixes + test
new e050ef0 fix: date range need to be something
new 8f8180a fix: chunking json file so that it does not error out
new 982919d fix: updating gmu
new afa1cd9 fix: update insitu test case
new ac07710 fix: provider is an array
new e90be21 fix: provider can be a list
new 9967030 fix: add providers in stats
new c754da3 fix: update stats structure
new 5b0b5cd fix: platforms are in string
new 4d415c2 fix: validate empty result
new ca22649 fix: make error more descriptive
new a5c2a84 fix: update stats
new 81aa23e fix: pushing quant aq file
new 587a980 feat: add raw endpoint
new 72b03fd fix: allow user to provide chunk size
new 15e723c fix: add missing provider + project
new 2bc1a3e fix: wrong path name
new 79a11f2 fix: cleaning up unused code
new cbf5a10 fix: remove null / nan values in min-max
new f71a42f fix: trying to fix lambda pyspark error /dev/fd/62: No such file or directory
new e14253a fix: cannot create directory
new 3704a66 fix: lambda spark not starting
new 2bd0c06 fix: downgrade java
new 71e5dae fix: update all versions
new 05c8e03 fix: still cannot start pyspark
The 162 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.