damccorm opened a new issue, #20542:
URL: https://github.com/apache/beam/issues/20542

   I'm trying to download C4 via [these instructions](https://github.com/google-research/text-to-text-transfer-transformer#c4),
   and 3 hours into my job I get the error below. I can't find any help on Google for this error.
   
    
   
   Traceback (most recent call last):
     File "/usr/local/lib/python3.6/site-packages/dataflow_worker/batchworker.py", line 649, in do_work
       work_executor.execute()
     File "/usr/local/lib/python3.6/site-packages/dataflow_worker/executor.py", line 179, in execute
       op.start()
     File "dataflow_worker/shuffle_operations.py", line 63, in dataflow_worker.shuffle_operations.GroupedShuffleReadOperation.start
     File "dataflow_worker/shuffle_operations.py", line 64, in dataflow_worker.shuffle_operations.GroupedShuffleReadOperation.start
     File "dataflow_worker/shuffle_operations.py", line 79, in dataflow_worker.shuffle_operations.GroupedShuffleReadOperation.start
     File "dataflow_worker/shuffle_operations.py", line 80, in dataflow_worker.shuffle_operations.GroupedShuffleReadOperation.start
     File "dataflow_worker/shuffle_operations.py", line 84, in dataflow_worker.shuffle_operations.GroupedShuffleReadOperation.start
     File "apache_beam/runners/worker/operations.py", line 332, in apache_beam.runners.worker.operations.Operation.output
     File "apache_beam/runners/worker/operations.py", line 195, in apache_beam.runners.worker.operations.SingletonConsumerSet.receive
     File "dataflow_worker/shuffle_operations.py", line 261, in dataflow_worker.shuffle_operations.BatchGroupAlsoByWindowsOperation.process
     File "dataflow_worker/shuffle_operations.py", line 268, in dataflow_worker.shuffle_operations.BatchGroupAlsoByWindowsOperation.process
     File "apache_beam/runners/worker/operations.py", line 332, in apache_beam.runners.worker.operations.Operation.output
     File "apache_beam/runners/worker/operations.py", line 195, in apache_beam.runners.worker.operations.SingletonConsumerSet.receive
     File "apache_beam/runners/worker/operations.py", line 670, in apache_beam.runners.worker.operations.DoOperation.process
     File "apache_beam/runners/worker/operations.py", line 671, in apache_beam.runners.worker.operations.DoOperation.process
     File "apache_beam/runners/common.py", line 1215, in apache_beam.runners.common.DoFnRunner.process
     File "apache_beam/runners/common.py", line 1279, in apache_beam.runners.common.DoFnRunner._reraise_augmented
     File "apache_beam/runners/common.py", line 1213, in apache_beam.runners.common.DoFnRunner.process
     File "apache_beam/runners/common.py", line 569, in apache_beam.runners.common.SimpleInvoker.invoke_process
     File "apache_beam/runners/common.py", line 1371, in apache_beam.runners.common._OutputProcessor.process_outputs
     File "apache_beam/runners/worker/operations.py", line 195, in apache_beam.runners.worker.operations.SingletonConsumerSet.receive
     File "apache_beam/runners/worker/operations.py", line 670, in apache_beam.runners.worker.operations.DoOperation.process
     File "apache_beam/runners/worker/operations.py", line 671, in apache_beam.runners.worker.operations.DoOperation.process
     File "apache_beam/runners/common.py", line 1215, in apache_beam.runners.common.DoFnRunner.process
     File "apache_beam/runners/common.py", line 1294, in apache_beam.runners.common.DoFnRunner._reraise_augmented
     File "/usr/local/lib/python3.6/site-packages/future/utils/__init__.py", line 446, in raise_with_traceback
       raise exc.with_traceback(traceback)
     File "apache_beam/runners/common.py", line 1213, in apache_beam.runners.common.DoFnRunner.process
     File "apache_beam/runners/common.py", line 570, in apache_beam.runners.common.SimpleInvoker.invoke_process
     File "/mnt/pccfs/backed_up/crytting/persuasion/createc4/lib/python3.6/site-packages/apache_beam/transforms/core.py", line 815, in <lambda>
       self.process = lambda element: fn(element)
   TypeError: clean_page() got an unexpected keyword argument 'badwords_regex' [while running 'clean_pages']
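
   For anyone triaging this, here is a minimal sketch of how a `TypeError` of this shape can arise in plain Python. It only assumes that the caller passes `badwords_regex` to a `clean_page` whose signature does not declare it, which is one possible explanation (for example, mismatched package versions between the caller and the callee) and is not confirmed by this report. The `clean_page` below and the `accepts_keyword` helper are hypothetical stand-ins, not the real C4 or Beam code.

```python
import inspect


def clean_page(page):
    """Hypothetical stand-in: a clean_page that does not declare badwords_regex."""
    return page


def accepts_keyword(fn, name):
    """Return True if fn can be called with the keyword argument `name`."""
    params = inspect.signature(fn).parameters
    takes_var_kwargs = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    return name in params or takes_var_kwargs


# Checking the signature up front shows the mismatch without running a job.
supported = accepts_keyword(clean_page, "badwords_regex")

# Calling with the undeclared keyword raises the same kind of TypeError.
try:
    clean_page({"text": "hello"}, badwords_regex=None)
    message = ""
except TypeError as err:
    message = str(err)

print(supported)
print(message)
```

   Running a check like `accepts_keyword` against the `clean_page` that is actually installed in the worker environment would show whether its signature matches what the pipeline code passes.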
   
   Imported from Jira [BEAM-11098](https://issues.apache.org/jira/browse/BEAM-11098). Original Jira may contain additional context.
   Reported by: crytting.

