[
https://issues.apache.org/jira/browse/ARROW-5799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Wes McKinney updated ARROW-5799:
--------------------------------
Summary: [Python] Fail to write nested data to Parquet via BigQuery API
(was: NotImplementedError: struct<AddressId: int64, Country: string, County:
string, Flat: string, Locality: string, Number: string, PostCode: string,
Street: string, Town: string, Type: string>)
> [Python] Fail to write nested data to Parquet via BigQuery API
> --------------------------------------------------------------
>
> Key: ARROW-5799
> URL: https://issues.apache.org/jira/browse/ARROW-5799
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Affects Versions: 0.13.0
> Environment: Python 3.6
> Reporter: David Draper
> Priority: Major
>
> I keep getting the error in the title (the NotImplementedError quoted in the original summary). Any ideas on how to fix this issue?
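> For reference, a minimal sketch that reproduces the same error independently of the BigQuery pipeline, assuming pyarrow 0.13 and a hypothetical dict-valued "address" column (load_table_from_dataframe serializes the frame to Parquet via pyarrow, which is where the struct type comes from):
> import pandas as pd
> import pyarrow as pa
> import pyarrow.parquet as pq
>
> # A column of dicts becomes a struct<...> column in Arrow
> # ("address" is a made-up stand-in for the nested API field).
> df = pd.DataFrame({"address": [{"Town": "London", "PostCode": "E1"}]})
> table = pa.Table.from_pandas(df)
>
> # On pyarrow 0.13 this raises NotImplementedError: struct<PostCode: string, Town: string>,
> # since the Parquet writer does not yet support nested struct columns.
> pq.write_table(table, "learners.parquet")
>
> The full pipeline that triggers it: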
> import json
> from urllib.parse import urlencode
>
> import pandas as pd
> import requests
>
> entries = []  # one DataFrame per academy, concatenated below
> for company, credentials in loginCredentials.items():
>     password = credentials["Password"]
>     username = credentials["Username"]
>     Academy = company
>     Phase = credentials["Phase"]
>     values = {"grant_type": "password", "username": username, "password": password}
>     data = urlencode(values).encode()
>     session = requests.Session()
>     session.headers = {
>         'Content-Type': 'application/x-www-form-urlencoded'
>     }
>     response_body = session.post(TOKEN_API_URL, data=data)
>     access_token = json.loads(response_body.text)["access_token"]
>     # print(Academy + " " + str(response_body.status_code) + " " + response_body.reason)
>     session.headers = {
>         'Authorization': 'Bearer {}'.format(access_token)
>     }
>     # print(username + access_token)
>     learner_responses = session.get(LEARNER_API_URL)
>     learner_exclusions = session.get(LEARNER_EXCLUSIONS_URL)
>     # print(Academy + " " + str(learner_responses.status_code) + " " + learner_responses.reason)
>     if learner_responses.status_code == 200:
>         response = json.loads(learner_responses.text)
>         learners = pd.DataFrame(response)
>         learners['Establishment_Name'] = Academy
>         learners['Establishment_Phase'] = Phase
>         entries.append(learners)
>     else:
>         continue
>
> appended_data = pd.concat(entries, ignore_index=True)
>
> from google.cloud import bigquery
> project = 'aet-data-lake'
> # note: `credentials` still holds the last value from the loop above
> client = bigquery.Client(credentials=credentials, project=project)
> dataset_ref = client.dataset('RAW')
> table_ref = dataset_ref.table('Learners_AET')
> job_config = bigquery.LoadJobConfig()
> job_config.autodetect = True
> client.load_table_from_dataframe(appended_data, table_ref, job_config=job_config).result()
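>
> A possible workaround until nested structs are writable to Parquet: flatten the dict-valued columns before the load job, so the intermediate Parquet file contains no struct types. A sketch, assuming the nested fields arrive as dicts per row (json_normalize is pd.json_normalize on pandas >= 1.0, pandas.io.json.json_normalize on older versions):
> import pandas as pd
>
> # Flatten nested dicts into scalar "parent_child" columns.
> flat = pd.json_normalize(appended_data.to_dict(orient="records"), sep="_")
> client.load_table_from_dataframe(flat, table_ref, job_config=job_config).result()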
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)