My environment is >>>> : 1 master 3 segments SQL >>>>>>: CREATE TABLE call_center ( cc_call_center_sk integer, cc_call_center_id character varying(16), cc_rec_start_date date, cc_rec_end_date date, cc_closed_date_sk integer, cc_open_date_sk integer, cc_name character varying(50), cc_class character varying(50), cc_employees integer, cc_sq_ft integer, cc_hours character varying(20), cc_manager character varying(40), cc_mkt_id integer, cc_mkt_class character varying(50), cc_mkt_desc character varying(100), cc_market_manager character varying(40), cc_division text, cc_division_name character varying(50), cc_company text, cc_company_name character varying(50), cc_street_number character varying(10), cc_street_name character varying(60), cc_street_type character varying(15), cc_suite_number character varying(10), cc_city character varying(60), cc_county character varying(30), cc_state text, cc_zip character varying(10), cc_country character varying(20), cc_gmt_offset numeric(5,2), cc_tax_percentage numeric(5,2) ) WITH (appendonly=true, orientation=parquet) DISTRIBUTED BY (cc_call_center_sk);
CREATE EXTERNAL TABLE ext_call_center (like call_center) LOCATION ( 'gpfdist://segment3:9001/call_center_[0-9]*_[0-9]*.dat', 'gpfdist://segment3:9002/call_center_[0-9]*_[0-9]*.dat', 'gpfdist://segment3:9003/call_center_[0-9]*_[0-9]*.dat', 'gpfdist://segment3:9004/call_center_[0-9]*_[0-9]*.dat', 'gpfdist://segment3:9005/call_center_[0-9]*_[0-9]*.dat', 'gpfdist://segment3:9006/call_center_[0-9]*_[0-9]*.dat', 'gpfdist://segment2:9001/call_center_[0-9]*_[0-9]*.dat', 'gpfdist://segment2:9002/call_center_[0-9]*_[0-9]*.dat', 'gpfdist://segment2:9003/call_center_[0-9]*_[0-9]*.dat', 'gpfdist://segment2:9004/call_center_[0-9]*_[0-9]*.dat', 'gpfdist://segment2:9005/call_center_[0-9]*_[0-9]*.dat', 'gpfdist://segment2:9006/call_center_[0-9]*_[0-9]*.dat', 'gpfdist://segment1:9001/call_center_[0-9]*_[0-9]*.dat', 'gpfdist://segment1:9002/call_center_[0-9]*_[0-9]*.dat', 'gpfdist://segment1:9003/call_center_[0-9]*_[0-9]*.dat', 'gpfdist://segment1:9004/call_center_[0-9]*_[0-9]*.dat', 'gpfdist://segment1:9005/call_center_[0-9]*_[0-9]*.dat', 'gpfdist://segment1:9006/call_center_[0-9]*_[0-9]*.dat') FORMAT 'TEXT' (DELIMITER '|' NULL AS '' ESCAPE AS E'\\'); insert into call_center select * from ext_call_center; ERROR: External scan error: There are more external files (URLs) than primary segments that can read them (COptTasks.cpp:1756) hawq config >>>>: GUC: add_missing_fromValue: off GUC: application_nameValue: GUC: array_nullsValue: on GUC: authentication_timeoutValue: 60 GUC: backslash_quoteValue: safe_encoding GUC: block_sizeValue: 32768 GUC: bonjour_nameValue: GUC: check_function_bodiesValue: on GUC: client_encodingValue: UTF8 GUC: client_min_messagesValue: ERROR GUC: cpu_index_tuple_costValue: 0.005 GUC: cpu_operator_costValue: 0.0025 GUC: cpu_tuple_costValue: 0.01 GUC: cursor_tuple_fractionValue: 1 GUC: custom_variable_classesValue: GUC: DateStyleValue: ISO, MDY GUC: db_user_namespaceValue: off GUC: deadlock_timeoutValue: 1000 GUC: debug_assertionsValue: off GUC: debug_pretty_printValue: off GUC: debug_print_parseValue: off GUC: debug_print_planValue: off GUC: debug_print_prelim_planValue: off GUC: debug_print_rewrittenValue: off GUC: debug_print_slice_tableValue: off GUC: default_hash_table_bucket_numberValue: 6 GUC: default_statement_memValue: 128000 GUC: default_statistics_targetValue: 25 GUC: default_tablespaceValue: GUC: default_transaction_isolationValue: read committed GUC: default_transaction_read_onlyValue: off GUC: dfs_urlValue: localhost:8020/hawq GUC: dynamic_library_pathValue: $libdir GUC: effective_cache_sizeValue: 16384 GUC: enable_bitmapscanValue: on GUC: enable_groupaggValue: on GUC: enable_hashaggValue: on GUC: enable_hashjoinValue: on GUC: enable_indexscanValue: on GUC: enable_mergejoinValue: off GUC: enable_nestloopValue: off GUC: enable_secure_filesystemValue: off GUC: enable_seqscanValue: on GUC: enable_sortValue: on GUC: enable_tidscanValue: on GUC: escape_string_warningValue: on GUC: explain_memory_verbosityValue: suppress GUC: explain_pretty_printValue: on GUC: extra_float_digitsValue: 0 GUC: filesystem_support_truncateValue: on GUC: from_collapse_limitValue: 20 GUC: gp_adjust_selectivity_for_outerjoinsValue: on GUC: gp_analyze_relative_errorValue: 0.25 GUC: gp_autostats_modeValue: NONE GUC: gp_autostats_on_change_thresholdValue: 2147483647 GUC: gp_backup_directIOValue: off GUC: gp_backup_directIO_read_chunk_mbValue: 20 GUC: gp_cached_segworkers_thresholdValue: 5 GUC: gp_command_countValue: 6 GUC: gp_connections_per_threadValue: 64 GUC: gp_debug_lingerValue: 0 GUC: gp_dynamic_partition_pruningValue: on GUC: gp_email_connect_avoid_durationValue: 7200 GUC: gp_email_connect_failuresValue: 5 GUC: gp_email_connect_timeoutValue: 15 GUC: gp_email_fromValue: GUC: gp_email_smtp_passwordValue: GUC: gp_email_smtp_serverValue: localhost:25 GUC: gp_email_smtp_useridValue: GUC: gp_email_toValue: GUC: gp_enable_agg_distinctValue: on GUC: gp_enable_agg_distinct_pruningValue: on GUC: gp_enable_direct_dispatchValue: on GUC: gp_enable_fallback_planValue: on GUC: gp_enable_fast_sriValue: on GUC: gp_enable_gpperfmonValue: off GUC: gp_enable_groupext_distinct_gatherValue: on GUC: gp_enable_groupext_distinct_pruningValue: on GUC: gp_enable_multiphase_aggValue: on GUC: gp_enable_predicate_propagationValue: on GUC: gp_enable_preuniqueValue: on GUC: gp_enable_sequential_window_plansValue: on GUC: gp_enable_sort_distinctValue: on GUC: gp_enable_sort_limitValue: on GUC: gp_external_enable_execValue: on GUC: gp_external_grant_privilegesValue: off GUC: gp_external_max_segsValue: 64 GUC: gp_filerep_tcp_keepalives_countValue: 2 GUC: gp_filerep_tcp_keepalives_idleValue: 60 GUC: gp_filerep_tcp_keepalives_intervalValue: 30 GUC: gp_force_use_default_temporary_directoryValue: off GUC: gp_gpperfmon_send_intervalValue: 1 GUC: gp_hashjoin_tuples_per_bucketValue: 5 GUC: gp_idf_deduplicateValue: auto GUC: gp_interconnect_cache_future_packetsValue: on GUC: gp_interconnect_default_rttValue: 20 GUC: gp_interconnect_fc_methodValue: LOSS GUC: gp_interconnect_hash_multiplierValue: 2 GUC: gp_interconnect_min_retries_before_timeoutValue: 100 GUC: gp_interconnect_min_rtoValue: 20 GUC: gp_interconnect_queue_depthValue: 4 GUC: gp_interconnect_setup_timeoutValue: 7200 GUC: gp_interconnect_snd_queue_depthValue: 2 GUC: gp_interconnect_timer_checking_periodValue: 20 GUC: gp_interconnect_timer_periodValue: 5 GUC: gp_interconnect_transmit_timeoutValue: 3600 GUC: gp_interconnect_typeValue: UDP GUC: gp_log_formatValue: csv GUC: gp_max_csv_line_lengthValue: 1048576 GUC: gp_max_databasesValue: 16 GUC: gp_max_filespacesValue: 8 GUC: gp_max_packet_sizeValue: 8192 GUC: gp_max_partition_levelValue: 0 GUC: gp_max_plan_sizeValue: 0 GUC: gp_max_relationsValue: 65536 GUC: gp_max_tablespacesValue: 16 GUC: gp_motion_cost_per_rowValue: 0 GUC: gp_num_contents_in_clusterValue: -10000 GUC: gp_query_context_mem_limitValue: 102400 GUC: gp_reject_percent_thresholdValue: 300 GUC: gp_reraise_signalValue: on GUC: gp_roleValue: utility GUC: gp_safefswritesizeValue: 0 GUC: gp_segment_connect_timeoutValue: 600 GUC: gp_segments_for_plannerValue: 0 GUC: gp_session_idValue: -1 GUC: gp_set_proc_affinityValue: off GUC: gp_snmp_communityValue: public GUC: gp_snmp_monitor_addressValue: GUC: gp_snmp_use_inform_or_trapValue: trap GUC: gp_statistics_pullup_from_child_partitionValue: on GUC: gp_statistics_use_fkeysValue: on GUC: gp_subtrans_warn_limitValue: 16777216 GUC: gp_temporary_directory_mark_errorValue: 0 GUC: gp_udp_bufsize_kValue: 0 GUC: gp_vmem_protect_segworker_cache_limitValue: 500 GUC: gp_workfile_checksummingValue: on GUC: gp_workfile_compress_algorithmValue: none GUC: gp_workfile_limit_per_queryValue: 0 GUC: gp_workfile_limit_per_segmentValue: 0 GUC: gpperfmon_portValue: 8888 GUC: hawq_global_rm_typeValue: yarn GUC: hawq_master_address_hostValue: master.bigdata GUC: hawq_master_address_portValue: 6432 GUC: hawq_master_temp_directoryValue: /tmp GUC: hawq_metadata_cache_block_capacityValue: 2097152 GUC: hawq_metadata_cache_check_intervalValue: 30 GUC: hawq_metadata_cache_flush_ratioValue: 0.85 GUC: hawq_metadata_cache_free_block_max_ratioValue: 0.05 GUC: hawq_metadata_cache_free_block_normal_ratioValue: 0.2 GUC: hawq_metadata_cache_max_hdfs_file_numValue: 524288 GUC: hawq_metadata_cache_reduce_ratioValue: 0.7 GUC: hawq_metadata_cache_refresh_intervalValue: 3600 GUC: hawq_metadata_cache_refresh_max_numValue: 1000 GUC: hawq_metadata_cache_refresh_timeoutValue: 3600 GUC: hawq_re_cgroup_hierarchy_nameValue: hadoop-yarn GUC: hawq_re_cgroup_mount_pointValue: /sys/fs/cgroup GUC: hawq_re_cpu_enableValue: off GUC: hawq_re_cpu_weightValue: 1024 GUC: hawq_re_memory_overcommit_maxValue: 8192 GUC: hawq_re_vcore_pcore_ratioValue: 1 GUC: hawq_rm_cluster_report_periodValue: 60 GUC: hawq_rm_clusterratio_core_to_memorygb_factorValue: 5 GUC: hawq_rm_connpool_sameaddr_buffersizeValue: 2 GUC: hawq_rm_container_batch_limitValue: 1000 GUC: hawq_rm_enable_connpoolValue: on GUC: hawq_rm_force_alterqueue_cancel_queued_requestValue: on GUC: hawq_rm_force_fifo_queuingValue: on GUC: hawq_rm_master_domain_portValue: 5436 GUC: hawq_rm_master_portValue: 5437 GUC: hawq_rm_memory_limit_persegValue: 480GB GUC: hawq_rm_min_resource_persegValue: 2 GUC: hawq_rm_nocluster_timeoutValue: 60 GUC: hawq_rm_nresqueue_limitValue: 128 GUC: hawq_rm_nslice_perseg_limitValue: 5000 GUC: hawq_rm_nvcore_limit_persegValue: 16 GUC: hawq_rm_nvseg_for_analyze_nopart_perquery_limitValue: 512 GUC: hawq_rm_nvseg_for_analyze_nopart_perquery_perseg_limitValue: 8 GUC: hawq_rm_nvseg_for_analyze_part_perquery_limitValue: 256 GUC: hawq_rm_nvseg_for_analyze_part_perquery_perseg_limitValue: 4 GUC: hawq_rm_nvseg_for_copy_from_perqueryValue: 6 GUC: hawq_rm_nvseg_perquery_limitValue: 512 GUC: hawq_rm_nvseg_perquery_perseg_limitValue: 6 GUC: hawq_rm_nvseg_variance_amon_seg_limitValue: 1 GUC: hawq_rm_nvseg_variance_amon_seg_respool_limitValue: 2 GUC: hawq_rm_regularize_io_factorValue: 1 GUC: hawq_rm_regularize_io_maxValue: 1.37439e+11 GUC: hawq_rm_regularize_nvseg_factorValue: 1 GUC: hawq_rm_regularize_nvseg_maxValue: 300 GUC: hawq_rm_regularize_usage_factorValue: 1 GUC: hawq_rm_rejectrequest_nseg_limitValue: 0.25 GUC: hawq_rm_request_timeoutcheck_intervalValue: 1 GUC: hawq_rm_resource_allocation_timeoutValue: 600 GUC: hawq_rm_resource_idle_timeoutValue: 300 GUC: hawq_rm_respool_test_fileValue: GUC: hawq_rm_return_percent_on_overcommitValue: 10 GUC: hawq_rm_segment_config_refresh_intervalValue: 30 GUC: hawq_rm_segment_heartbeat_intervalValue: 30 GUC: hawq_rm_segment_heartbeat_timeoutValue: 300 GUC: hawq_rm_segment_portValue: 5438 GUC: hawq_rm_segment_tmpdir_detect_intervalValue: 300 GUC: hawq_rm_session_lease_heartbeat_enableValue: on GUC: hawq_rm_session_lease_heartbeat_intervalValue: 10 GUC: hawq_rm_session_lease_timeoutValue: 180 GUC: hawq_rm_stmt_nvsegValue: 0 GUC: hawq_rm_stmt_vseg_memoryValue: 128mb GUC: hawq_rm_tolerate_nseg_limitValue: 0.25 GUC: hawq_rm_yarn_addressValue: worker1.bigdata:8050 GUC: hawq_rm_yarn_app_nameValue: hawq GUC: hawq_rm_yarn_queue_nameValue: default GUC: hawq_rm_yarn_scheduler_addressValue: worker1.bigdata:8030 GUC: hawq_segment_address_portValue: 40000 GUC: hawq_segment_history_keep_periodValue: 365 GUC: hawq_segment_temp_directoryValue: /tmp GUC: integer_datetimesValue: on GUC: IntervalStyleValue: postgres GUC: join_collapse_limitValue: 20 GUC: krb5_ccnameValue: /tmp/postgres.ccname GUC: krb_caseins_usersValue: off GUC: krb_server_keyfileValue: FILE:/data/pulse2-agent/agents/agent1/work/HAWQ-main-opt/rhel5_x86_64/src/hawq-db-dist/etc/krb5.keytab GUC: krb_srvnameValue: postgres GUC: lc_collateValue: C GUC: lc_ctypeValue: C GUC: lc_messagesValue: en_US.utf8 GUC: lc_monetaryValue: en_US.utf8 GUC: lc_numericValue: en_US.utf8 GUC: lc_timeValue: en_US.utf8 GUC: listen_addressesValue: * GUC: local_preload_librariesValue: GUC: log_autostatsValue: off GUC: log_connectionsValue: off GUC: log_disconnectionsValue: off GUC: log_dispatch_statsValue: off GUC: log_durationValue: off GUC: log_error_verbosityValue: default GUC: log_executor_statsValue: off GUC: log_hostnameValue: off GUC: log_min_duration_statementValue: -1 GUC: log_min_error_statementValue: error GUC: log_min_messagesValue: warning GUC: log_parser_statsValue: off GUC: log_planner_statsValue: off GUC: log_rotation_ageValue: 1440 GUC: log_rotation_sizeValue: 0 GUC: log_statementValue: none GUC: log_statement_statsValue: off GUC: log_timezoneValue: PRC GUC: log_truncate_on_rotationValue: off GUC: maintenance_work_memValue: 65536 GUC: master_directoryValue: GUC: max_appendonly_segfilesValue: 262144 GUC: max_appendonly_tablesValue: 10000 GUC: max_connectionsValue: 1280 GUC: max_files_per_processValue: 150 GUC: max_fsm_pagesValue: 200000 GUC: max_fsm_relationsValue: 1000 GUC: max_function_argsValue: 100 GUC: max_identifier_lengthValue: 63 GUC: max_index_keysValue: 32 GUC: max_locks_per_transactionValue: 128 GUC: max_prepared_transactionsValue: 250 GUC: max_stack_depthValue: 2048 GUC: max_work_memValue: 1024000 GUC: metadata_cache_testfileValue: GUC: optimizerValue: on GUC: optimizer_analyze_root_partitionValue: on GUC: optimizer_minidumpValue: onerror GUC: optimizer_parts_to_force_sort_on_insertValue: 160 GUC: password_encryptionValue: on GUC: password_hash_algorithmValue: MD5 GUC: pljava_classpathValue: GUC: pljava_release_lingering_savepointsValue: off GUC: pljava_statement_cache_sizeValue: 0 GUC: pljava_vmoptionsValue: GUC: portValue: 6432 GUC: pxf_enable_filter_pushdownValue: on GUC: pxf_enable_locality_optimizationsValue: on GUC: pxf_enable_stat_collectionValue: on GUC: pxf_remote_service_loginValue: GUC: pxf_remote_service_secretValue: GUC: pxf_service_addressValue: localhost:51200 GUC: pxf_stat_max_fragmentsValue: 100 GUC: random_page_costValue: 100 GUC: regex_flavorValue: advanced GUC: runaway_detector_activation_percentValue: 95 GUC: search_pathValue: "$user",public GUC: seg_max_connectionsValue: 3000 GUC: segment_directoryValue: GUC: seq_page_costValue: 1 GUC: server_encodingValue: UTF8 GUC: server_ticket_renew_intervalValue: 43200000 GUC: server_versionValue: 8.2.15 GUC: server_version_numValue: 80215 GUC: shared_buffersValue: 4000 GUC: shared_preload_librariesValue: GUC: sslValue: off GUC: ssl_ciphersValue: ALL:!ADH:!LOW:!EXP:!MD5:@STRENGTH GUC: standard_conforming_stringsValue: off GUC: standby_address_hostValue: localhost GUC: statement_timeoutValue: 0 GUC: superuser_reserved_connectionsValue: 3 GUC: tcp_keepalives_countValue: 9 GUC: tcp_keepalives_idleValue: 7200 GUC: tcp_keepalives_intervalValue: 75 GUC: temp_buffersValue: 1024 GUC: TimeZoneValue: PRC GUC: timezone_abbreviationsValue: Default GUC: track_activitiesValue: on GUC: track_countsValue: off GUC: transaction_isolationValue: read committed GUC: transaction_read_onlyValue: off GUC: transform_null_equalsValue: off GUC: unix_socket_directoryValue: GUC: unix_socket_groupValue: GUC: unix_socket_permissionsValue: 511 GUC: update_process_titleValue: on GUC: vacuum_cost_delayValue: 0 GUC: vacuum_cost_limitValue: 200 GUC: vacuum_cost_page_dirtyValue: 20 GUC: vacuum_cost_page_missValue: 10 GUC: vacuum_freeze_min_ageValue: 100000000 GUC: work_memValue: 51200 At 2016-09-21 13:00:41, "Vineet Goel" <vvin...@apache.org> wrote: Could you please post your SQL DDL statement? How many URLs do you have in your external table? Also, your HASH dist table - how many buckets are defined, if any? Are the # of URLs more than the # of buckets or default_hash_table_bucket_number value? Perhaps you can attach your hawq-site.xml file as well. Also see: http://hdb.docs.pivotal.io/20/datamgmt/load/g-gpfdist-protocol.html Thanks Vineet On Tue, Sep 20, 2016 at 7:07 PM 来熊 <yin....@163.com> wrote: Hi,all: I am testing hawq 2.0.0 , and I find a problem like this: I load data from an external table (created using "like target_table" statement) , if the target table was distributed by some column(s), it raise this error: External scan error: There are more external files (URLs) than primary segments that can read them (COptTasks.cpp:1756) if the target table was distributed randomly, it works well, I don't set any parameter special,does anybody know how to resolve this problem? thanks a lot.