[GitHub] incubator-hawq pull request: HAWQ-703. Serialize HCatalog Complex ...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/633#discussion_r61827025 --- Diff: pxf/pxf-api/src/main/java/org/apache/hawq/pxf/api/utilities/EnumHawqType.java --- @@ -0,0 +1,105 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.hawq.pxf.api.utilities; + +import java.io.IOException; +import org.codehaus.jackson.JsonGenerator; +import org.codehaus.jackson.map.JsonSerializer; +import org.codehaus.jackson.map.annotate.JsonSerialize; +import org.codehaus.jackson.map.SerializerProvider; +import org.codehaus.jackson.JsonProcessingException; + +class EnumHawqTypeSerializer extends JsonSerializer<EnumHawqType> { + +@Override +public void serialize(EnumHawqType value, JsonGenerator generator, + SerializerProvider provider) throws IOException, + JsonProcessingException { + generator.writeString(value.getTypeName()); --- End diff -- is it planned to add the modifiers to the serialized data? if so, maybe add a TODO here? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. 
If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
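The review comment above concerns a Jackson serializer that writes an enum's type name rather than its constant name. As a self-contained illustration of the enum/type-name pairing such a serializer relies on (hypothetical class names, not the actual PXF sources):

```java
// Hypothetical sketch: an enum carrying the HAWQ type name that a custom
// Jackson serializer would emit via generator.writeString(value.getTypeName()).
class EnumHawqTypeSketch {
    enum HawqType {
        Int8Type("int8"), NumericType("numeric"), BpcharType("bpchar");

        private final String typeName;

        HawqType(String typeName) {
            this.typeName = typeName;
        }

        String getTypeName() {
            return typeName;
        }
    }

    public static void main(String[] args) {
        // The serialized form is the lowercase type name, not the constant name.
        System.out.println(HawqType.Int8Type.getTypeName()); // prints "int8"
    }
}
```

If modifiers are later added to the serialized form, a TODO next to the `writeString` call, as the reviewer suggests, would be the natural marker.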
[GitHub] incubator-hawq pull request: HAWQ-703. Serialize HCatalog Complex ...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/633#discussion_r61826885 --- Diff: src/bin/psql/describe.c --- @@ -4263,8 +4263,13 @@ describePxfTable(const char *profile, const char *pattern, bool verbose) printQueryOpt myopt = pset.popt; printTableContent cont; int cols = 0; + if (verbose) + { --- End diff -- remove the braces here or add them to the else block
[GitHub] incubator-hawq pull request: HAWQ-703. Serialize HCatalog Complex ...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/633#discussion_r61819180 --- Diff: pxf/pxf-service/src/test/java/org/apache/hawq/pxf/service/MetadataResponseFormatterTest.java --- @@ -85,26 +86,45 @@ public void formatResponseStringWithModifiers() throws Exception { List<Metadata.Field> fields = new ArrayList<Metadata.Field>(); Metadata.Item itemName = new Metadata.Item("default", "table1"); Metadata metadata = new Metadata(itemName, fields); -fields.add(new Metadata.Field("field1", "int")); -fields.add(new Metadata.Field("field2", "numeric", +fields.add(new Metadata.Field("field1", EnumHawqType.Int8Type, "bigint")); +fields.add(new Metadata.Field("field2", EnumHawqType.NumericType, "decimal", new String[] {"1349", "1789"})); -fields.add(new Metadata.Field("field3", "char", +fields.add(new Metadata.Field("field3", EnumHawqType.BpcharType, "char", new String[] {"50"})); metadataList.add(metadata); response = MetadataResponseFormatter.formatResponse(metadataList, "path.file"); StringBuilder expected = new StringBuilder("{\"PXFMetadata\":[{"); expected.append("\"item\":{\"path\":\"default\",\"name\":\"table1\"},") .append("\"fields\":[") -.append("{\"name\":\"field1\",\"type\":\"int\"},") - .append("{\"name\":\"field2\",\"type\":\"numeric\",\"modifiers\":[\"1349\",\"1789\"]},") - .append("{\"name\":\"field3\",\"type\":\"char\",\"modifiers\":[\"50\"]}") + .append("{\"name\":\"field1\",\"type\":\"int8\",\"sourceType\":\"bigint\"},") + .append("{\"name\":\"field2\",\"type\":\"numeric\",\"sourceType\":\"decimal\",\"modifiers\":[\"1349\",\"1789\"]},") + .append("{\"name\":\"field3\",\"type\":\"bpchar\",\"sourceType\":\"char\",\"modifiers\":[\"50\"]}") .append("]}]}"); assertEquals(expected.toString(), convertResponseToString(response)); } @Test +public void formatResponseStringWithSourceType() throws Exception { +List<Metadata> metadataList = new ArrayList<Metadata>(); +List<Metadata.Field> fields = new ArrayList<Metadata.Field>(); +Metadata.Item itemName = new Metadata.Item("default", "table1"); 
+Metadata metadata = new Metadata(itemName, fields); +fields.add(new Metadata.Field("field1", EnumHawqType.Float8Type, "double")); +metadataList.add(metadata); + +response = MetadataResponseFormatter.formatResponse(metadataList, "path.file"); +StringBuilder expected = new StringBuilder("{\"PXFMetadata\":[{"); + expected.append("\"item\":{\"path\":\"default\",\"name\":\"table1\"},") +.append("\"fields\":[") + .append("{\"name\":\"field1\",\"type\":\"float8\",\"sourceType\":\"double\"}") +.append("]}]}"); + +//assertEquals(expected.toString(), convertResponseToString(response)); --- End diff -- uncomment?
[GitHub] incubator-hawq pull request: HAWQ-703. Serialize HCatalog Complex ...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/633#discussion_r61818951 --- Diff: pxf/pxf-hive/src/main/java/org/apache/hawq/pxf/plugins/hive/utilities/HiveUtilities.java --- @@ -186,7 +155,7 @@ public static Table getHiveTable(HiveMetaStoreClient client, Metadata.Item itemN * @param modifiers type modifiers to be verified * @return whether modifiers are null or integers */ -private static boolean verifyModifers(String[] modifiers) { +private static boolean verifyIntegerModifers(String[] modifiers) { --- End diff -- typo Modifiers
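For context, the renamed method validates that type modifiers are integers (e.g. numeric precision and scale). A rough, self-contained sketch of that kind of check, illustrative only and not the actual HiveUtilities code:

```java
class ModifierCheck {
    // Returns true when modifiers is null or every entry is a non-empty
    // string of decimal digits (e.g. precision/scale values such as "1349").
    static boolean verifyIntegerModifiers(String[] modifiers) {
        if (modifiers == null) {
            return true; // no modifiers to verify
        }
        for (String m : modifiers) {
            if (m == null || !m.matches("\\d+")) {
                return false; // non-numeric modifier
            }
        }
        return true;
    }
}
```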
[GitHub] incubator-hawq pull request: HAWQ-703. Serialize HCatalog Complex ...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/633#discussion_r61818778 --- Diff: pxf/pxf-hive/src/main/java/org/apache/hawq/pxf/plugins/hive/utilities/HiveUtilities.java --- @@ -88,94 +90,61 @@ public static Table getHiveTable(HiveMetaStoreClient client, Metadata.Item itemN * Unsupported types will result in an exception. * * The supported mappings are: - * {@code tinyint -> int2} - * {@code smallint -> int2} - * {@code int -> int4} - * {@code bigint -> int8} - * {@code boolean -> bool} - * {@code float -> float4} - * {@code double -> float8} - * {@code string -> text} - * {@code binary -> bytea} - * {@code timestamp -> timestamp} - * {@code date -> date} - * {@code decimal(precision, scale) -> numeric(precision, scale)} - * {@code varchar(size) -> varchar(size)} - * {@code char(size) -> bpchar(size)} + * {@code tinyint -> int2} --- End diff -- indentation?
[GitHub] incubator-hawq pull request: HAWQ-703. Serialize HCatalog Complex ...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/633#discussion_r61818748 --- Diff: pxf/pxf-hive/src/main/java/org/apache/hawq/pxf/plugins/hive/utilities/HiveUtilities.java --- @@ -20,9 +20,12 @@ */ +import java.util.Arrays; import java.util.List; import java.util.ArrayList; +import org.apache.hawq.pxf.api.utilities.EnumHawqType; --- End diff -- please put these imports together with the other pxf imports below.
[GitHub] incubator-hawq pull request: HAWQ-703. Serialize HCatalog Complex ...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/633#discussion_r61818073 --- Diff: pxf/pxf-api/src/main/java/org/apache/hawq/pxf/api/Metadata.java --- @@ -67,36 +68,43 @@ public String toString() { } /** - * Class representing item field - name and type. + * Class representing item field - name, type, source type, modifiers. + * Type - exposed type of field + * Source type - type of field in underlying source + * Modifiers - additional attributes which describe type or field */ public static class Field { private String name; -private String type; // TODO: change to enum +private EnumHawqType type; // field type which PXF exposes +private String sourceType; // filed type PXF reads from private String[] modifiers; // type modifiers, optional field -public Field(String name, String type) { - -if (StringUtils.isBlank(name) || StringUtils.isBlank(type)) { -throw new IllegalArgumentException("Field name and type cannot be empty"); -} - -this.name = name; -this.type = type; +public Field(String name, EnumHawqType type, String sourceType) { +if (StringUtils.isBlank(name) || StringUtils.isBlank(type.getTypeName()) --- End diff -- type could be null, right? maybe just check that it's not null - it probably can't be an empty string because it's an enum.
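The reviewer's point is that an enum reference can be null but can never be a blank string, so a plain null check is the right guard (and avoids a NullPointerException on `type.getTypeName()`). A hedged sketch of the suggested constructor validation, with hypothetical names standing in for the PXF classes:

```java
class FieldSketch {
    // Stand-in for EnumHawqType; only the constants matter for this sketch.
    enum HawqType { Int8Type, NumericType }

    static class Field {
        final String name;
        final HawqType type;
        final String sourceType;

        Field(String name, HawqType type, String sourceType) {
            // A null check suffices for the enum; only the strings can be blank.
            if (name == null || name.trim().isEmpty()
                    || type == null
                    || sourceType == null || sourceType.trim().isEmpty()) {
                throw new IllegalArgumentException(
                        "Field name, type and source type cannot be empty");
            }
            this.name = name;
            this.type = type;
            this.sourceType = sourceType;
        }
    }
}
```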
[GitHub] incubator-hawq pull request: Fix Orca error message: Orca supports...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/630#discussion_r61138397 --- Diff: src/backend/gpopt/translate/CTranslatorUtils.cpp --- @@ -1583,9 +1585,9 @@ CTranslatorUtils::PdrgpbsGroupBy return PdrgpbsRollup(pmp, pgrcl, ulCols, phmululGrpColPos, pbsGrpCols); } - if (GROUPINGTYPE_GROUPING_SETS != pgrcl->groupType) + if (GROUPINGTYPE_CUBE == pgrcl->groupType) { - GPOS_RAISE(gpdxl::ExmaDXL, gpdxl::ExmiQuery2DXLUnsupportedFeature, GPOS_WSZ_LIT("Rollup and cube")); + GPOS_RAISE(gpdxl::ExmaDXL, gpdxl::ExmiQuery2DXLUnsupportedFeature, GPOS_WSZ_LIT(âCubeâ)); --- End diff -- the quote marks look off. should be `"` and not `â`
[GitHub] incubator-hawq pull request: HAWQ-683. Fix param name for Protocol...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/625#discussion_r60321148 --- Diff: pxf/README.md --- @@ -20,7 +20,7 @@ Package Contents PXF is distributed as a set of RPMs - - +git dif --- End diff -- dif?
[GitHub] incubator-hawq pull request: HAWQ-615. Handle incomptible tables w...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/551#discussion_r58473430 --- Diff: pxf/pxf-hive/src/main/java/org/apache/hawq/pxf/plugins/hive/HiveMetadataFetcher.java --- @@ -62,11 +74,24 @@ public HiveMetadataFetcher(InputData md) { List metadataList = new ArrayList(); +if(tblsDesc.size() > 1) { --- End diff -- What if there are 2 tables but both are not supported? In that case the function will return 0 tables. Is that an acceptable scenario? If so, perhaps add a test to cover it? Did you consider changing the condition to require at least one 'good' table - check at the end of the loop whether metadataList is empty, and if it is, throw an exception.
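The alternative the reviewer describes - skip unsupported tables during the loop, and fail only if none survive - could be sketched like this (hypothetical fetch/support logic, not the actual HiveMetadataFetcher):

```java
import java.util.ArrayList;
import java.util.List;

class MetadataFilterSketch {
    // Keeps tables whose type is supported; throws only when *no* table survives,
    // instead of aborting on the first unsupported table.
    static List<String> collectSupported(List<String> tables) {
        List<String> metadataList = new ArrayList<String>();
        for (String table : tables) {
            if (isSupported(table)) { // skip, don't abort, on a bad table
                metadataList.add(table);
            }
        }
        if (metadataList.isEmpty()) {
            throw new IllegalStateException("No supported tables found");
        }
        return metadataList;
    }

    // Hypothetical support check standing in for the real Hive type validation.
    static boolean isSupported(String table) {
        return !table.startsWith("unsupported");
    }
}
```

With this shape, two tables of which one is unsupported still yield one result, while two unsupported tables raise an error, which is the test case the reviewer asks to cover.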
[GitHub] incubator-hawq pull request: HAWQ-615. Handle incomptible tables w...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/551#discussion_r58473077 --- Diff: pxf/pxf-hive/src/test/java/org/apache/hawq/pxf/plugins/hive/HiveMetadataFetcherTest.java --- @@ -154,6 +155,115 @@ public void getTableMetadata() throws Exception { assertEquals("int4", field.getType()); } +@Test +public void getTableMetadataWithMultipleTables() throws Exception { +prepareConstruction(); + +fetcher = new HiveMetadataFetcher(inputData); + +String tablepattern = "*"; +String dbpattern = "*"; +String dbname = "default"; +String tablenamebase = "regulartable"; +String pattern = dbpattern + "." + tablepattern; + +List dbNames = new ArrayList(Arrays.asList(dbname)); +List tableNames = new ArrayList(); + +// Prepare for tables +List fields = new ArrayList(); +fields.add(new FieldSchema("field1", "string", null)); +fields.add(new FieldSchema("field2", "int", null)); +StorageDescriptor sd = new StorageDescriptor(); +sd.setCols(fields); + +// Mock hive tables returned from hive client +for(int index=1;index<=2;index++) { +String tableName = tablenamebase + index; +tableNames.add(tableName);; --- End diff -- two ;
[GitHub] incubator-hawq pull request: HAWQ-601. Clear external table code.
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/525#discussion_r57665601 --- Diff: src/backend/cdb/cdbdatalocality.c --- @@ -768,12 +769,23 @@ static void check_keep_hash_and_external_table( * whose default value is set to default_segment_num */ ExtTableEntry* extEnrty = GetExtTableEntry(rel->rd_id); - if(extEnrty->command){ - // command external table case - if (context->externTableOnClauseSegNum == 0) { - context->externTableOnClauseSegNum = targetPolicy->bucketnum; + + bool isPxf = false; + if(!extEnrty->command && extEnrty->locations){ --- End diff -- missing spaces after `if` and before `{`
[GitHub] incubator-hawq pull request: HAWQ-577. Updated PXF metadata api to...
Github user hornn commented on the pull request: https://github.com/apache/incubator-hawq/pull/522#issuecomment-202643273 +1. looks good
[GitHub] incubator-hawq pull request: HAWQ-577. Updated PXF metadata api to...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/522#discussion_r57654981 --- Diff: pxf/pxf-service/src/main/java/org/apache/hawq/pxf/service/rest/MetadataResource.java --- @@ -41,6 +41,7 @@ import org.apache.hawq.pxf.api.MetadataFetcher; import org.apache.hawq.pxf.api.utilities.InputData; import org.apache.hawq.pxf.service.MetadataFetcherFactory; +import org.apache.hawq.pxf.service.MetadataResponse; --- End diff -- is this a new file?
[GitHub] incubator-hawq pull request: HAWQ-577. Updated PXF metadata api to...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/522#discussion_r57654541 --- Diff: pxf/pxf-service/src/main/java/org/apache/hawq/pxf/service/MetadataResponseFormatter.java --- @@ -46,76 +45,50 @@ * @return JSON formatted response * @throws IOException if converting the data to JSON fails */ -public static String formatResponseString(List metadataList) throws IOException { -return MetadataResponseFormatter.metadataToJSON(metadataList); -} - -/** - * Serializes a metadata in JSON, - * To be used as the result string for HAWQ. - * An example result is as follows: - * - * {"PXFMetadata":[{"item":{"path":"default","name":"t1"},"fields":[{"name":"a","type":"int"},{"name":"b","type":"float"}]}]} - */ -private static String metadataToJSON(List metadataList) throws IOException { - -if (metadataList == null || metadataList.isEmpty()) { - return METADATA_DEFAULT_RESPONSE; +public static MetadataResponse formatResponse(List metadataList, String path) throws IOException { +/* print the fragment list to log when in debug level */ +if (LOG.isDebugEnabled()) { +MetadataResponseFormatter.printMetadata(metadataList, path); } -StringBuilder result = null; - -for(Metadata metadata: metadataList) { -if(metadata == null) { -throw new IllegalArgumentException("metadata object is null - cannot serialize"); -} -if ((metadata.getFields() == null) || metadata.getFields().isEmpty()) { -throw new IllegalArgumentException("metadata for " + metadata.getItem() + " contains no fields - cannot serialize"); -} -if (result == null) { -result = new StringBuilder("{\"PXFMetadata\":["); /* prefix info */ -} else { -result.append(","); -} - -ObjectMapper mapper = new ObjectMapper(); -mapper.setSerializationInclusion(Inclusion.NON_EMPTY); // ignore empty fields -result.append(mapper.writeValueAsString(metadata)); -} - -return result.append("]}").toString(); /* append suffix info */ - +return new MetadataResponse(metadataList); } /** * Converts 
metadata list to a readable string. * Intended for debugging purposes only. */ -private static String metadataToString(List<Metadata> metadataList) { -StringBuilder result = new StringBuilder("Metadata:"); +private static void printMetadata(List<Metadata> metadataList, String path) { +LOG.debug("Metadata List for path " + path + ": "); + +if (null == metadataList || metadataList.isEmpty()) { +LOG.debug("No metadata"); +return; +} for(Metadata metadata: metadataList) { -result.append(" Metadata for item \""); +StringBuilder result = new StringBuilder(); +result.append("Metadata for item \""); --- End diff -- the quotes are not closed for the case of null metadata. perhaps move `\"` to be part of line 79.
[GitHub] incubator-hawq pull request: HAWQ-462. Dispatch dfs_address from m...
Github user hornn commented on the pull request: https://github.com/apache/incubator-hawq/pull/503#issuecomment-201108761 I really like this solution - it looks really clean. What happens in the non-secured case - do we always pass this var regardless?
[GitHub] incubator-hawq pull request: Writable security fix
Github user hornn commented on the pull request: https://github.com/apache/incubator-hawq/pull/497#issuecomment-200624680 Not sure I understood everything, so correct me if I am wrong. So right now we have two flows: 1. For readable tables we append dfs_address in hd_work_mgr and parse it in pxfuriparser. 2. For writable tables we append dfs_address as part of the URI that is part of the external table context. Then we parse it in gpbridgeapi, but only in the export case (writable). My question is - shouldn't we consolidate the two solutions and adopt the 2nd flow for both readable and writable?
[GitHub] incubator-hawq pull request: Writable security fix
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/497#discussion_r57267606 --- Diff: src/bin/gpfusion/gpbridgeapi.c --- @@ -273,6 +277,26 @@ void gpbridge_export_start(PG_FUNCTION_ARGS) } +void parse_namespace(char *uri, char** extracted_namespace){ + int i = 0; + int len = strlen(uri); + + while(uri[i] != '*' && i < len){ --- End diff -- Actually, shouldn't we parse the URI as part of the parse function we already have? that way we can know that we parsed everything else before we expect to read the delimiter and the dfs_address - similar to the way it's implemented right now, when we parse dfs_address that was appended by hd_work_mgr.
[GitHub] incubator-hawq pull request: Writable security fix
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/497#discussion_r57267537 --- Diff: src/backend/cdb/cdbquerycontextdispatching.c --- @@ -616,6 +632,37 @@ RebuildFilesystemCredentials(QueryContextInfo *cxt, HTAB ** currentFilesystemCre pfree(binary); } + +/* + * deserialize the namespace data + */ + +static void +RebuildNamespace(QueryContextInfo *cxt) +{ + + int len; + char buffer[4], *binary; + ReadData(cxt, buffer, sizeof(buffer), TRUE); + + len = (int) ntohl(*(uint32 *) buffer); + binary = palloc(len); + if(ReadData(cxt, binary, len, TRUE)) + { + StringInfoData buffer; + initStringInfoOfString(&buffer, binary, len); + + cxt->sharedPath = palloc(buffer.len + 1); + while(buffer.cursor < buffer.len) { --- End diff -- missing space after while. Also I think the palloc + loop can be replaced by a copy function `pnstrdup`
[GitHub] incubator-hawq pull request: Writable security fix
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/497#discussion_r57267450 --- Diff: src/backend/access/external/hd_work_mgr.c --- @@ -768,6 +769,16 @@ make_allocation_output_string(List *segment_fragments) initStringInfo(); appendStringInfoString(, SEGWORK_PREFIX); + if (enable_secure_filesystem) --- End diff -- if we append the dfs_address through the external framework, why do we need to append it here as well? wouldn't it work for both readable and writable tables?
[GitHub] incubator-hawq pull request: Writable security fix
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/497#discussion_r57267369 --- Diff: src/backend/access/external/fileam.c --- @@ -681,6 +687,8 @@ external_insert(ExternalInsertDesc extInsertDesc, HeapTuple instup) fcinfo.arg[0] = HeapTupleGetDatum(instup); fcinfo.argnull[0] = false; + fcinfo.arg[1] = "test"; --- End diff -- what does it mean?
[GitHub] incubator-hawq pull request: Writable security fix
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/497#discussion_r57267295 --- Diff: src/include/access/fileam.h --- @@ -44,12 +44,13 @@ * ExternalInsertDescData is used for storing state related * to inserting data into a writable external table. */ -typedef struct ExternalInsertDescData +typedef struct ExternalInsertDescData { Relation ext_rel; FILE* ext_file; char* ext_uri; /* "command:" or "tablespace:" */ bool ext_noop; /* no op. this segdb needs to do nothing (e.g. mirror seg) */ + char* ext_namespace; /* Namespace data */ --- End diff -- indentation looks off
[GitHub] incubator-hawq pull request: Writable security fix
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/497#discussion_r57267242 --- Diff: src/bin/gpfusion/gpbridgeapi.c --- @@ -508,17 +532,16 @@ size_t fill_buffer(gphadoop_context* context, char* start, size_t size) void add_delegation_token(PxfInputData *inputData) { PxfHdfsTokenData *token = NULL; - char* dfs_address = NULL; + char *dfs_address = inputData->gphduri->dfs_address; if (!enable_secure_filesystem) return; token = palloc0(sizeof(PxfHdfsTokenData)); - get_hdfs_location_from_filespace(&dfs_address); -elog(DEBUG2, "locating token for %s", dfs_address); +// get_hdfs_location_from_filespace(&dfs_address); --- End diff -- remove :)
[GitHub] incubator-hawq pull request: Writable security fix
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/497#discussion_r57267229 --- Diff: src/bin/gpfusion/gpbridgeapi.c --- @@ -273,6 +277,26 @@ void gpbridge_export_start(PG_FUNCTION_ARGS) } +void parse_namespace(char *uri, char** extracted_namespace){ + int i = 0; + int len = strlen(uri); + + while(uri[i] != '*' && i < len){ + i++; + } + + uri[i++] = '\0'; + if(i < len){ + *extracted_namespace = palloc(len - i + 1); + int extracted_namespace_len = len - i; + int j = 0; + while(i < len && j < extracted_namespace_len){ --- End diff -- I think this loop can be replaced by `pnstrdup`
[GitHub] incubator-hawq pull request: Writable security fix
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/497#discussion_r57267124 --- Diff: src/bin/gpfusion/gpbridgeapi.c --- @@ -273,6 +277,26 @@ void gpbridge_export_start(PG_FUNCTION_ARGS) } +void parse_namespace(char *uri, char** extracted_namespace){ + int i = 0; + int len = strlen(uri); + + while(uri[i] != '*' && i < len){ --- End diff -- `*` can be part of the PXF location (URL), because we support regex matching for file names (e.g. `file*.txt`). If this is going to be a permanent part of the URL (not just for secured env), perhaps we can scan from the end? Another option is to replace the delimiter.
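Scanning from the end, as the reviewer suggests, avoids mistaking a `*` wildcard earlier in the location (e.g. `file*.txt`) for the delimiter. A sketch in Java for illustration only (the real code in gpbridgeapi.c is C, and the URI layout here is hypothetical):

```java
class DelimiterSketch {
    // Splits uri on the *last* occurrence of delim, so wildcards such as
    // "file*.txt" earlier in the path are not mistaken for the delimiter.
    // Returns {uri, null} when the delimiter is absent.
    static String[] splitOnLastDelimiter(String uri, char delim) {
        int i = uri.lastIndexOf(delim);
        if (i < 0) {
            return new String[] { uri, null };
        }
        return new String[] { uri.substring(0, i), uri.substring(i + 1) };
    }
}
```

Scanning from the end still fails if the trailing dfs_address itself can contain the delimiter, which is why replacing the delimiter with a character illegal in both parts is the other option the reviewer raises.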
[GitHub] incubator-hawq pull request: Writable security fix
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/497#discussion_r57266957 --- Diff: src/backend/gpopt/ivy.xml --- @@ -38,7 +38,7 @@ under the License. - + --- End diff -- relevant to this commit?
[GitHub] incubator-hawq pull request: Writable security fix
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/497#discussion_r57266801 --- Diff: src/backend/cdb/cdbquerycontextdispatching.c --- @@ -616,6 +632,37 @@ RebuildFilesystemCredentials(QueryContextInfo *cxt, HTAB ** currentFilesystemCre pfree(binary); } + +/* + * deserialize the namespace data + */ + --- End diff -- remove new line
[GitHub] incubator-hawq pull request: Writable security fix
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/497#discussion_r57266715 --- Diff: src/backend/cdb/cdbquerycontextdispatching.c --- @@ -313,6 +313,22 @@ FinalizeQueryContextInfo(QueryContextInfo *cxt) WriteData(cxt, buffer.data, buffer.len); WriteData(cxt, credential, size); } + +} + +//Add Namespace for PXF +if(Gp_role != GP_ROLE_EXECUTE){ + const char *namespace = cxt->sharedPath; + int size = strlen(namespace); --- End diff -- indentation looks off...
[GitHub] incubator-hawq pull request: Writable security fix
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/497#discussion_r57266695 --- Diff: src/backend/cdb/cdbquerycontextdispatching.c --- @@ -313,6 +313,22 @@ FinalizeQueryContextInfo(QueryContextInfo *cxt) WriteData(cxt, buffer.data, buffer.len); WriteData(cxt, credential, size); } + +} + +//Add Namespace for PXF --- End diff -- please change the comment to the C-standard `/* */` style
[GitHub] incubator-hawq pull request: HAWQ 459 Enhanced metadata api to sup...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/477#discussion_r57214858 --- Diff: pxf/pxf-hive/src/main/java/org/apache/hawq/pxf/plugins/hive/utilities/HiveUtilities.java --- @@ -203,19 +205,37 @@ private static boolean verifyModifers(String[] modifiers) { * It can be either table_name or db_name.table_name. * * @param qualifiedName Hive table name - * @return {@link org.apache.hawq.pxf.api.Metadata.Table} object holding the full table name + * @return {@link Metadata.Item} object holding the full table name */ -public static Metadata.Table parseTableQualifiedName(String qualifiedName) { +public static Metadata.Item extractTableFromName(String qualifiedName) { +List items = extractTablesFromPattern(null, qualifiedName); +if(items.isEmpty()) { +throw new IllegalArgumentException("No tables found"); --- End diff -- We have code in HAWQ (in libchurl) that already handles exceptions from PXF. It could be improved of course, but we cannot rely solely on JSON because some exceptions are thrown directly from Tomcat, or are the result of connectivity issues.
[GitHub] incubator-hawq pull request: HAWQ-465. Implement stored procedure ...
Github user hornn commented on the pull request: https://github.com/apache/incubator-hawq/pull/479#issuecomment-200136727 @sansanichfb Thank you for the link to the design doc - it made things much clearer. I think the examples in the JIRA need to be updated, though - the pattern and path are combined. Great job, it looks really good! @shivzone Good point regarding the API changes. We didn't take releases into account until now when changing the protocol version.
[GitHub] incubator-hawq pull request: HAWQ-465. Implement stored procedure ...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/479#discussion_r57101750 --- Diff: src/backend/utils/adt/pxf_functions.c --- @@ -0,0 +1,164 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ + +#include "catalog/external/externalmd.h" +#include "postgres.h" +#include "fmgr.h" +#include "funcapi.h" +#include "utils/builtins.h" + + +typedef struct ObjectContext +{ + ListCell *current_object; + ListCell *current_field; +} ObjectContext; + +ListCell* pxf_object_fields_enum_start(text *profile, text *pattern); +ObjectContext* pxf_object_fields_enum_next(ObjectContext *object_context); +void pxf_object_fields_enum_end(void); + +ListCell* +pxf_object_fields_enum_start(text *profile, text *pattern) +{ + List *objects = NIL; + + char *profile_cstr = text_to_cstring(profile); + char *pattern_cstr = text_to_cstring(pattern); + + objects = get_pxf_object_metadata(profile_cstr, pattern_cstr, NULL); + + return list_head(objects); +} + +ObjectContext* +pxf_object_fields_enum_next(ObjectContext *object_context) +{ + + //first time call + if (object_context->current_object && !object_context->current_field) + object_context->current_field = list_head(((PxfItem *) lfirst(object_context->current_object))->fields); + + //next field for the same object + else if lnext(object_context->current_field) + object_context->current_field = lnext(object_context->current_field); + //next table + else if lnext(object_context->current_object) + { + object_context->current_object = lnext(object_context->current_object); + object_context->current_field = list_head(((PxfItem *) lfirst(object_context->current_object))->fields); + + //no objects, no fields left + } else + object_context = NULL; + + return object_context; +} + +void pxf_object_fields_enum_end(void) +{ + //cleanup +} + +Datum pxf_get_object_fields(PG_FUNCTION_ARGS) +{ + MemoryContext oldcontext; + FuncCallContext *funcctx; + HeapTuple tuple; + Datum result; + Datum values[4]; + bool nulls[4]; + + ObjectContext *object_context; + + text *profile = PG_GETARG_TEXT_P(0); + text *pattern = PG_GETARG_TEXT_P(1); + + /* stuff done only on the first call of the function */ + if (SRF_IS_FIRSTCALL()) + { + TupleDesc tupdesc; + + 
/* create a function context for cross-call persistence */ + funcctx = SRF_FIRSTCALL_INIT(); + + /* +* switch to memory context appropriate for multiple function calls +*/ + oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx); + + /* initialize object fileds metadata scanning code */ + object_context = (ObjectContext *) palloc0(sizeof(ObjectContext)); + object_context->current_object = pxf_object_fields_enum_start(profile, pattern); + funcctx->user_fctx = (void *) object_context; + + /* +* build tupdesc for result tuples. This must match this function's +* pg_proc entry! +*/ + tupdesc = CreateTemplateTupleDesc(4, false); + TupleDescInitEntry(tupdesc, (AttrNumber) 1, "path", + TEXTOID, -1, 0); + TupleDescInitEntry(tupdesc, (AttrNumber) 2, "objectname", + TEXTOID, -1, 0); + TupleDescInitEntry(tupdesc, (AttrNumber) 3, "columnname", + TEXTOID, -1, 0); + TupleDescInitEntry(tupdesc, (AttrNumber) 4, "columntype",
[GitHub] incubator-hawq pull request: HAWQ-465. Implement stored procedure ...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/479#discussion_r57101396 --- Diff: src/include/catalog/namespace.h --- @@ -95,4 +95,6 @@ extern char *namespace_search_path; extern List *fetch_search_path(bool includeImplicit); +#define HiveProfileName "Hive" --- End diff -- then I think it should be part of namespace.c - the header file is used by other files, so if it's not useful for them it shouldn't be there.
[GitHub] incubator-hawq pull request: HAWQ-465. Implement stored procedure ...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/479#discussion_r57044806 --- Diff: src/test/unit/mock/backend/utils/adt/pxf_functions_mock.c --- @@ -0,0 +1,58 @@ +/* --- End diff -- I don't think auto-generated files should be committed.
[GitHub] incubator-hawq pull request: HAWQ-465. Implement stored procedure ...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/479#discussion_r57044714 --- Diff: src/test/regress/output/json_load.source --- @@ -184,7 +184,7 @@ END TRANSACTION; -- negative test: duplicated tables BEGIN TRANSACTION; SELECT load_json_data('@abs_builddir@/data/hcatalog/multi_table_duplicates.json'); -ERROR: relation "hcatalog.db.t" already exists +ERROR: relation "db.t" already exists in namespace with oid=4284481535 --- End diff -- is this oid fixed?
[GitHub] incubator-hawq pull request: HAWQ-465. Implement stored procedure ...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/479#discussion_r57044521 --- Diff: src/backend/utils/adt/pxf_functions.c ---
[GitHub] incubator-hawq pull request: HAWQ-465. Implement stored procedure ...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/479#discussion_r57041688 --- Diff: src/backend/utils/adt/pxf_functions.c --- @@ -0,0 +1,164 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +#include "catalog/external/externalmd.h" --- End diff -- please re-organize the includes in lexicographic order (but I think postgres.h should stay at the beginning)
[GitHub] incubator-hawq pull request: HAWQ-465. Implement stored procedure ...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/479#discussion_r57040549 --- Diff: src/backend/catalog/external/externalmd.c --- @@ -23,10 +23,12 @@ * Author: antova * * - * Utilities for loading external hcatalog metadata + * Utilities for loading external PXF metadata * */ +#include "catalog/external/externalmd.h" --- End diff -- please move to its right place in the include list (after catalog/catquery.h)
[GitHub] incubator-hawq pull request: HAWQ-465. Implement stored procedure ...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/479#discussion_r57040438 --- Diff: src/backend/access/external/pxfmasterapi.c --- @@ -258,14 +258,20 @@ ha_failover(GPHDUri *hadoop_uri, } } -/* Concatenate two literal strings using stringinfo */ -char* concat(char *body, char *tail) +/* Concatenate multiple literal strings using stringinfo */ +char* concat(int num_args, ...) --- End diff -- :)
[GitHub] incubator-hawq pull request: HAWQ 459 Enhanced metadata api to sup...
Github user hornn commented on the pull request: https://github.com/apache/incubator-hawq/pull/477#issuecomment-199623793 @shivzone Really nice changes! Do they affect the way PXF interacts with HAWQ? IIRC we already assume the returned data will be a list of items, but I think the JSON field names are different now. Since the JSON response is different, I think the API should change.
[GitHub] incubator-hawq pull request: HAWQ 459 Enhanced metadata api to sup...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/477#discussion_r56933654 --- Diff: pxf/pxf-service/src/main/java/org/apache/hawq/pxf/service/utilities/ProtocolData.java --- @@ -95,6 +95,7 @@ public ProtocolData(Map<String, String> paramsMap) { accessor = getProperty("ACCESSOR"); resolver = getProperty("RESOLVER"); fragmenter = getOptionalProperty("FRAGMENTER"); +metadata = getOptionalProperty("METADATA"); --- End diff -- perhaps call it MDFETCHER?
[GitHub] incubator-hawq pull request: HAWQ 459 Enhanced metadata api to sup...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/477#discussion_r56933558 --- Diff: pxf/pxf-service/src/main/java/org/apache/hawq/pxf/service/MetadataResponseFormatter.java --- @@ -55,50 +54,65 @@ public static String formatResponseString(Metadata metadata) throws IOException * To be used as the result string for HAWQ. * An example result is as follows: * - * {"PXFMetadata":[{"table":{"dbName":"default","tableName":"t1"},"fields":[{"name":"a","type":"int"},{"name":"b","type":"float"}]}]} + * {"PXFMetadata":[{"item":{"path":"default","name":"t1"},"fields":[{"name":"a","type":"int"},{"name":"b","type":"float"}]}]} */ -private static String metadataToJSON(Metadata metadata) throws IOException { +private static String metadataToJSON(List metadataList) throws IOException { --- End diff -- Have a look at the way we serialize the Fragments - we had a problem of having too many fragments, so we needed to serialize them inside a streaming object. I think that now that the metadata can include multiple tables, the same approach should be taken here as well. The same also applies to the debug function that prints all of the items - if there are too many of them the StringBuilder will run out of memory. The solution in the fragments case was to log one fragment at a time - perhaps the same should be done here as well.
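The streaming approach the comment refers to can be sketched roughly as below. This is a hedged illustration with a hypothetical `Item` stand-in, a plain `java.io.Writer`, and hand-built JSON strings; the actual PXF code serializes its `Metadata` objects with Jackson, so names and wire format here are assumptions, not the real implementation.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Arrays;
import java.util.List;

public class StreamingFormatter {
    /* Hypothetical stand-in for a PXF metadata item. */
    static class Item {
        final String path, name;
        Item(String path, String name) { this.path = path; this.name = name; }
        String toJSON() {
            return "{\"item\":{\"path\":\"" + path + "\",\"name\":\"" + name + "\"}}";
        }
    }

    /*
     * Each item is serialized and written individually instead of being
     * accumulated in one StringBuilder, so memory use stays bounded by the
     * size of a single item rather than by the whole response.
     */
    static void writeResponse(List<Item> items, Writer out) throws IOException {
        out.write("{\"PXFMetadata\":[");
        for (int i = 0; i < items.size(); i++) {
            if (i > 0) out.write(",");
            out.write(items.get(i).toJSON());
        }
        out.write("]}");
    }

    public static void main(String[] args) throws IOException {
        StringWriter out = new StringWriter();
        writeResponse(Arrays.asList(new Item("default", "t1"), new Item("default", "t2")), out);
        System.out.println(out);
    }
}
```

The same one-at-a-time idea covers the debug logging concern: log each item as it is serialized instead of concatenating everything first.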
[GitHub] incubator-hawq pull request: HAWQ 459 Enhanced metadata api to sup...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/477#discussion_r56933465 --- Diff: pxf/pxf-service/src/main/java/org/apache/hawq/pxf/service/rest/MetadataResource.java --- @@ -56,51 +61,53 @@ public MetadataResource() throws IOException { } /** - * This function queries the HiveMetaStore to get the given table's - * metadata: Table name, field names, field types. The types are converted - * from HCatalog types to HAWQ types. Supported HCatalog types: TINYINT, - * SMALLINT, INT, BIGINT, BOOLEAN, FLOAT, DOUBLE, STRING, BINARY, TIMESTAMP, - * DATE, DECIMAL, VARCHAR, CHAR. + * This function queries the underlying store based on the given profile to get schema for items that match the given pattern + * metadata: Item name, field names, field types. The types are converted + * from the underlying types to HAWQ types. * Unsupported types result in an error. * Response Examples: * For a table default.t1 with 2 fields (a int, b float) will * be returned as: - * {"PXFMetadata":[{"table":{"dbName":"default","tableName":"t1"},"fields":[{"name":"a","type":"int"},{"name":"b","type":"float"}]}]} + * {"PXFMetadata":[{"item":{"path":"default","name":"t1"},"fields":[{"name":"a","type":"int"},{"name":"b","type":"float"}]}]} * * @param servletContext servlet context * @param headers http headers - * @param table HCatalog table name - * @return JSON formatted response with metadata for given table - * @throws Exception if connection to Hcatalog failed, table didn't exist or + * @param profile based on this the metadata source can be inferred + * @param pattern table/file name or pattern in the given source + * @return JSON formatted response with metadata of each item that corresponds to the pattern + * @throws Exception if connection to the source/catalog failed, item didn't exist for the pattern * its type or fields are not supported */ @GET -@Path("getTableMetadata") +@Path("getMetadata") @Produces("application/json") public 
Response read(@Context final ServletContext servletContext, @Context final HttpHeaders headers, - @QueryParam("table") final String table) + @QueryParam("profile") final String profile, + @QueryParam("pattern") final String pattern) throws Exception { -LOG.debug("getTableMetadata started"); +LOG.debug("getMetadata started"); String jsonOutput; try { + +// Convert headers into a regular map +Map<String, String> params = convertToCaseInsensitiveMap(headers.getRequestHeaders()); + +// Add profile and verify token +params.put("X-GP-PROFILE", profile.toLowerCase()); +ProtocolData protData = new ProtocolData(params); +SecuredHDFS.verifyToken(protData, servletContext); + // 1. start MetadataFetcher -MetadataFetcher metadataFetcher = MetadataFetcherFactory.create("org.apache.hawq.pxf.plugins.hive.HiveMetadataFetcher"); // TODO: - // nhorn - // - - // 09-03-15 - // - - // pass - // as - // param +MetadataFetcher metadataFetcher = MetadataFetcherFactory.create(protData); --- End diff -- awesome!
[GitHub] incubator-hawq pull request: HAWQ 459 Enhanced metadata api to sup...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/477#discussion_r56933465 --- Diff: pxf/pxf-service/src/main/java/org/apache/hawq/pxf/service/MetadataResponseFormatter.java --- @@ -55,50 +54,65 @@ public static String formatResponseString(Metadata metadata) throws IOException * To be used as the result string for HAWQ. * An example result is as follows: * - * {"PXFMetadata":[{"table":{"dbName":"default","tableName":"t1"},"fields":[{"name":"a","type":"int"},{"name":"b","type":"float"}]}]} + * {"PXFMetadata":[{"item":{"path":"default","name":"t1"},"fields":[{"name":"a","type":"int"},{"name":"b","type":"float"}]}]} */ -private static String metadataToJSON(Metadata metadata) throws IOException { +private static String metadataToJSON(List metadataList) throws IOException { -if (metadata == null) { -throw new IllegalArgumentException("metadata object is null - cannot serialize"); +if (metadataList == null || metadataList.isEmpty()) { +throw new IllegalArgumentException("no metadata objects found - cannot serialize"); } -if ((metadata.getFields() == null) || metadata.getFields().isEmpty()) { -throw new IllegalArgumentException("metadata contains no fields - cannot serialize"); +StringBuilder result = null; + +for(Metadata metadata: metadataList) { +if(metadata == null) { +throw new IllegalArgumentException("metadata object is null - cannot serialize"); +} +if ((metadata.getFields() == null) || metadata.getFields().isEmpty()) { +throw new IllegalArgumentException("metadata contains no fields - cannot serialize"); --- End diff -- do you want to specify which item is faulty, or do we not care?
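Naming the faulty item, as the comment asks about, could look roughly like this. The helper, its name, and the message wording are hypothetical stand-ins, not the actual PXF classes:

```java
import java.util.Collections;
import java.util.List;

public class FieldCheck {
    /*
     * Hypothetical sketch: include the item's name in the exception message
     * so the caller can tell which of many items failed to serialize.
     */
    static void checkFields(String itemName, List<String> fields) {
        if (fields == null || fields.isEmpty()) {
            throw new IllegalArgumentException(
                "metadata for item " + itemName + " contains no fields - cannot serialize");
        }
    }

    public static void main(String[] args) {
        try {
            checkFields("default.t1", Collections.<String>emptyList());
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```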
[GitHub] incubator-hawq pull request: HAWQ 459 Enhanced metadata api to sup...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/477#discussion_r5695 --- Diff: pxf/pxf-service/src/main/java/org/apache/hawq/pxf/service/MetadataResponseFormatter.java --- @@ -55,50 +54,65 @@ public static String formatResponseString(Metadata metadata) throws IOException * To be used as the result string for HAWQ. * An example result is as follows: * - * {"PXFMetadata":[{"table":{"dbName":"default","tableName":"t1"},"fields":[{"name":"a","type":"int"},{"name":"b","type":"float"}]}]} + * {"PXFMetadata":[{"item":{"path":"default","name":"t1"},"fields":[{"name":"a","type":"int"},{"name":"b","type":"float"}]}]} --- End diff -- if it's a list of items, should there be more `[` or `{` in the JSON?
[GitHub] incubator-hawq pull request: HAWQ 459 Enhanced metadata api to sup...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/477#discussion_r56933223 --- Diff: pxf/pxf-service/src/main/java/org/apache/hawq/pxf/service/MetadataFetcherFactory.java --- @@ -28,7 +30,12 @@ * abstract class which is returned by the MetadataFetcherFactory. */ public class MetadataFetcherFactory { -static public MetadataFetcher create(String fetcherName) throws Exception { -return (MetadataFetcher) Class.forName(fetcherName).newInstance(); +/* TODO: This is a tempororary workaround. + * The metadata class will be moved to the pxf-profile.xml in the future + */ --- End diff -- btw, why is it temporary? currently, both profile and specific plugin names are supported - so it will probably be possible to define a profile that includes the metadata fetcher, or specify the metadata fetcher itself.
[GitHub] incubator-hawq pull request: HAWQ 459 Enhanced metadata api to sup...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/477#discussion_r56933120 --- Diff: pxf/pxf-service/src/main/java/org/apache/hawq/pxf/service/MetadataFetcherFactory.java --- @@ -28,7 +30,12 @@ * abstract class which is returned by the MetadataFetcherFactory. */ public class MetadataFetcherFactory { -static public MetadataFetcher create(String fetcherName) throws Exception { -return (MetadataFetcher) Class.forName(fetcherName).newInstance(); +/* TODO: This is a tempororary workaround. --- End diff -- typo: temporary
[GitHub] incubator-hawq pull request: HAWQ 459 Enhanced metadata api to sup...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/477#discussion_r56933030
--- Diff: pxf/pxf-hive/src/main/java/org/apache/hawq/pxf/plugins/hive/utilities/HiveUtilities.java ---
@@ -225,15 +245,47 @@ private static boolean verifyModifers(String[] modifiers) {
 }
 if (toks.size() == 1) {
-dbName = HIVE_DEFAULT_DBNAME;
-tableName = toks.get(0);
+dbPattern = HIVE_DEFAULT_DBNAME;
+tablePattern = toks.get(0);
 } else if (toks.size() == 2) {
-dbName = toks.get(0);
-tableName = toks.get(1);
+dbPattern = toks.get(0);
+tablePattern = toks.get(1);
 } else {
-throw new IllegalArgumentException("\"" + qualifiedName + "\"" + errorMsg);
+throw new IllegalArgumentException("\"" + pattern + "\"" + errorMsg);
+}
+
+return getTablesFromPattern(client, dbPattern, tablePattern);
+
+}
+
+private static List<Metadata.Item> getTablesFromPattern(HiveMetaStoreClient client, String dbPattern, String tablePattern) {
+
+List<String> databases = null;
+List<Metadata.Item> itemList = new ArrayList<Metadata.Item>();
+List<String> tables = new ArrayList<String>();
+
+if(client == null || (!dbPattern.contains(WILDCARD) && !tablePattern.contains(WILDCARD)) ) {
+/* This case occurs when the call is invoked as part of the fragmenter api or when metadata is requested for a specific table name */
+itemList.add(new Metadata.Item(dbPattern, tablePattern));
+return itemList;
 }
-return new Metadata.Table(dbName, tableName);
+try {
+databases = client.getDatabases(dbPattern);
+if(databases.isEmpty()) {
+throw new IllegalArgumentException("No database found for the given pattern");
+}
+for(String dbName: databases) {
+for(String tableName: client.getTables(dbName, tablePattern)) {
--- End diff --
Can `getTables` accept patterns for both `dbName` and `tablePattern`, or must `dbName` be a real db? If patterns work for both, perhaps we can get the list in one call.
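The two-step resolution being discussed (match database patterns first, then list tables per matched database) can be sketched without a live metastore. This is a minimal, self-contained illustration, not the actual PXF code: the in-memory `METASTORE` map and the glob-style `matches` helper are hypothetical stand-ins for `HiveMetaStoreClient`, assuming Hive's usual `*` wildcard semantics.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PatternResolver {

    static final String WILDCARD = "*";

    // Hypothetical in-memory stand-in for the metastore: db name -> table names.
    static final Map<String, List<String>> METASTORE = new LinkedHashMap<>();
    static {
        METASTORE.put("default", List.of("sales", "stats"));
        METASTORE.put("demo", List.of("sales_2016"));
    }

    // Glob-style match where '*' matches any run of characters.
    static boolean matches(String pattern, String name) {
        return name.matches(pattern.replace(".", "\\.").replace(WILDCARD, ".*"));
    }

    // Two-step resolution mirroring the reviewed getTablesFromPattern():
    // match database names first, then tables within each matched database.
    // Without wildcards the pair is returned as-is, as in the fragmenter path.
    static List<String> resolve(String dbPattern, String tablePattern) {
        if (!dbPattern.contains(WILDCARD) && !tablePattern.contains(WILDCARD)) {
            return List.of(dbPattern + "." + tablePattern);
        }
        List<String> items = new ArrayList<>();
        for (Map.Entry<String, List<String>> db : METASTORE.entrySet()) {
            if (!matches(dbPattern, db.getKey())) {
                continue;
            }
            for (String table : db.getValue()) {
                if (matches(tablePattern, table)) {
                    items.add(db.getKey() + "." + table);
                }
            }
        }
        if (items.isEmpty()) {
            throw new IllegalArgumentException("No tables found for the given pattern");
        }
        return items;
    }

    public static void main(String[] args) {
        System.out.println(resolve("d*", "sales*")); // prints [default.sales, demo.sales_2016]
    }
}
```

The per-database loop reflects the constraint the review question probes: if `getTables` only accepted a concrete database name, a loop like this would be unavoidable; if it accepted a pattern too, the outer loop could collapse into one call.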
[GitHub] incubator-hawq pull request: HAWQ 459 Enhanced metadata api to sup...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/477#discussion_r56932987
--- Diff: pxf/pxf-hive/src/main/java/org/apache/hawq/pxf/plugins/hive/utilities/HiveUtilities.java ---
@@ -225,15 +245,47 @@ private static boolean verifyModifers(String[] modifiers) {
 }
 if (toks.size() == 1) {
-dbName = HIVE_DEFAULT_DBNAME;
-tableName = toks.get(0);
+dbPattern = HIVE_DEFAULT_DBNAME;
+tablePattern = toks.get(0);
 } else if (toks.size() == 2) {
-dbName = toks.get(0);
-tableName = toks.get(1);
+dbPattern = toks.get(0);
+tablePattern = toks.get(1);
 } else {
-throw new IllegalArgumentException("\"" + qualifiedName + "\"" + errorMsg);
+throw new IllegalArgumentException("\"" + pattern + "\"" + errorMsg);
+}
+
+return getTablesFromPattern(client, dbPattern, tablePattern);
+
+}
+
+private static List<Metadata.Item> getTablesFromPattern(HiveMetaStoreClient client, String dbPattern, String tablePattern) {
+
+List<String> databases = null;
+List<Metadata.Item> itemList = new ArrayList<Metadata.Item>();
+List<String> tables = new ArrayList<String>();
+
+if(client == null || (!dbPattern.contains(WILDCARD) && !tablePattern.contains(WILDCARD)) ) {
+/* This case occurs when the call is invoked as part of the fragmenter api or when metadata is requested for a specific table name */
+itemList.add(new Metadata.Item(dbPattern, tablePattern));
+return itemList;
 }
-return new Metadata.Table(dbName, tableName);
+try {
+databases = client.getDatabases(dbPattern);
+if(databases.isEmpty()) {
+throw new IllegalArgumentException("No database found for the given pattern");
+}
+for(String dbName: databases) {
+for(String tableName: client.getTables(dbName, tablePattern)) {
+if (!tableName.isEmpty()) {
--- End diff --
Why do you need to check for an empty string?
[GitHub] incubator-hawq pull request: HAWQ 459 Enhanced metadata api to sup...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/477#discussion_r56932849
--- Diff: pxf/pxf-hive/src/main/java/org/apache/hawq/pxf/plugins/hive/utilities/HiveUtilities.java ---
@@ -225,15 +245,47 @@ private static boolean verifyModifers(String[] modifiers) {
 }
 if (toks.size() == 1) {
-dbName = HIVE_DEFAULT_DBNAME;
-tableName = toks.get(0);
+dbPattern = HIVE_DEFAULT_DBNAME;
+tablePattern = toks.get(0);
 } else if (toks.size() == 2) {
-dbName = toks.get(0);
-tableName = toks.get(1);
+dbPattern = toks.get(0);
+tablePattern = toks.get(1);
 } else {
-throw new IllegalArgumentException("\"" + qualifiedName + "\"" + errorMsg);
+throw new IllegalArgumentException("\"" + pattern + "\"" + errorMsg);
+}
+
+return getTablesFromPattern(client, dbPattern, tablePattern);
+
--- End diff --
remove extra line
[GitHub] incubator-hawq pull request: HAWQ 459 Enhanced metadata api to sup...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/477#discussion_r56932656
--- Diff: pxf/pxf-hive/src/main/java/org/apache/hawq/pxf/plugins/hive/utilities/HiveUtilities.java ---
@@ -203,19 +205,37 @@ private static boolean verifyModifers(String[] modifiers) {
 * It can be either table_name or db_name.table_name.
 *
 * @param qualifiedName Hive table name
- * @return {@link org.apache.hawq.pxf.api.Metadata.Table} object holding the full table name
+ * @return {@link Metadata.Item} object holding the full table name
 */
-public static Metadata.Table parseTableQualifiedName(String qualifiedName) {
+public static Metadata.Item extractTableFromName(String qualifiedName) {
+List<Metadata.Item> items = extractTablesFromPattern(null, qualifiedName);
+if(items.isEmpty()) {
+throw new IllegalArgumentException("No tables found");
+}
+return items.get(0);
+}

-String dbName, tableName;
+/**
+ * Extracts the db_name and table_name from the qualifiedName.
--- End diff --
qualifiedName -> pattern?
[GitHub] incubator-hawq pull request: HAWQ 459 Enhanced metadata api to sup...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/477#discussion_r56932687
--- Diff: pxf/pxf-hive/src/main/java/org/apache/hawq/pxf/plugins/hive/utilities/HiveUtilities.java ---
@@ -203,19 +205,37 @@ private static boolean verifyModifers(String[] modifiers) {
 * It can be either table_name or db_name.table_name.
 *
 * @param qualifiedName Hive table name
- * @return {@link org.apache.hawq.pxf.api.Metadata.Table} object holding the full table name
+ * @return {@link Metadata.Item} object holding the full table name
 */
-public static Metadata.Table parseTableQualifiedName(String qualifiedName) {
+public static Metadata.Item extractTableFromName(String qualifiedName) {
+List<Metadata.Item> items = extractTablesFromPattern(null, qualifiedName);
+if(items.isEmpty()) {
+throw new IllegalArgumentException("No tables found");
+}
+return items.get(0);
+}

-String dbName, tableName;
+/**
+ * Extracts the db_name and table_name from the qualifiedName.
+ * qualifiedName is the Hive table name or pattern that the user enters in the CREATE EXTERNAL TABLE statement
+ * or when querying HCatalog table.
+ * It can be either table_name_pattern or db_name_pattern.table_name_pattern.
+ *
+ * @param client Hivemetastore client
+ * @param pattern Hive table name or pattern
+ * @return {@link Metadata.Item} object holding the full table name
--- End diff --
returns list, right?
[GitHub] incubator-hawq pull request: HAWQ 459 Enhanced metadata api to sup...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/477#discussion_r56932473
--- Diff: pxf/pxf-hive/src/main/java/org/apache/hawq/pxf/plugins/hive/HiveDataFragmenter.java ---
@@ -251,11 +251,11 @@ private void fetchTableMetaData(Metadata.Table tblDesc) throws Exception {
 StorageDescriptor descPartition = partition.getSd();
 props = MetaStoreUtils.getSchema(descPartition, descTable,
 null, // Map<string, string> parameters - can be empty
-tblDesc.getDbName(), tblDesc.getTableName(), // table
+tblDesc.getPath(), tblDesc.getName(), // table
--- End diff --
fix comment?
[GitHub] incubator-hawq pull request: HAWQ 459 Enhanced metadata api to sup...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/477#discussion_r56932413
--- Diff: pxf/pxf-api/src/main/java/org/apache/hawq/pxf/api/utilities/InputData.java ---
@@ -46,6 +46,7 @@
 protected String accessor;
 protected String resolver;
 protected String fragmenter;
+protected String metadata;
--- End diff --
Since this is actually the MetadataFetcher class name, perhaps rename it to metadataFetcher?
[GitHub] incubator-hawq pull request: HAWQ 459 Enhanced metadata api to sup...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/477#discussion_r56932345
--- Diff: pxf/pxf-api/src/main/java/org/apache/hawq/pxf/api/MetadataFetcher.java ---
@@ -19,27 +19,33 @@
 * under the License.
 */

+import java.util.List;
+
+import org.apache.hawq.pxf.api.utilities.InputData;
+import org.apache.hawq.pxf.api.utilities.Plugin;
+
 /**
- * Abstract class that defines getting metadata of a table.
+ * Abstract class that defines getting metadata of an item.
--- End diff --
an item -> items :)
[GitHub] incubator-hawq pull request: Fixing error message for exttab1 sour...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/433#discussion_r55791714
--- Diff: src/test/regress/output/exttab1.source ---
@@ -669,7 +669,7 @@ HINT: use the gpfdist protocol or COPY FROM instead
 create writable external table wet_neg1(a text, b text) location('gpfdist://@hostname@:7070/wet.out', 'gpfdist://@hostname@:7070/wet.out') format 'text';
 ERROR: location uri "gpfdist://@hostname@:7070/wet.out" appears more than once
 create writable external web table wet_pos5(a text, b text) execute 'some command' on segment 0 format 'text';
-ERROR: ON clause may not be used with a writable external table
+ERROR: the ON segment syntax for writableexternal tables is deprecated
--- End diff --
just looks like a typo: `writableexternal`
[GitHub] incubator-hawq pull request: Fixing error message for exttab1 sour...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/433#discussion_r55789501
--- Diff: src/test/regress/output/exttab1.source ---
@@ -669,7 +669,7 @@ HINT: use the gpfdist protocol or COPY FROM instead
 create writable external table wet_neg1(a text, b text) location('gpfdist://@hostname@:7070/wet.out', 'gpfdist://@hostname@:7070/wet.out') format 'text';
 ERROR: location uri "gpfdist://@hostname@:7070/wet.out" appears more than once
 create writable external web table wet_pos5(a text, b text) execute 'some command' on segment 0 format 'text';
-ERROR: ON clause may not be used with a writable external table
+ERROR: the ON segment syntax for writableexternal tables is deprecated
--- End diff --
should there be so many spaces in the middle of the error?
[GitHub] incubator-hawq pull request: HAWQ-462. Adding dfs_address from pgf...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/412#discussion_r55459533
--- Diff: src/backend/access/external/pxfuriparser.c ---
@@ -634,9 +638,20 @@ GPHDUri_parse_segwork(GPHDUri *uri, const char *uri_str)
 segwork = strstr(uri_str, segwork_substring);
 if (segwork == NULL)
 return;
- segwork += strlen(segwork_substring);
+ if (enable_secure_filesystem)
+ {
+ /* parse dfs address */
+ size_end = strchr(segwork, segwork_dfs_separator);
+ if(size_end != NULL)
--- End diff --
If it's a secured environment and there is no dfs address data, maybe we should error out. What do you think?
[GitHub] incubator-hawq pull request: HAWQ-462. Adding dfs_address from pgf...
Github user hornn commented on the pull request: https://github.com/apache/incubator-hawq/pull/412#issuecomment-194056909
Looks good. Do you plan to open a separate story for writable?
[GitHub] incubator-hawq pull request: HAWQ-462. Adding dfs_address from pgf...
Github user hornn commented on the pull request: https://github.com/apache/incubator-hawq/pull/412#issuecomment-192448360
General question - does it also solve the problem for writable? I thought for writable tables the code path didn't pass through hd_work_mgr.
[GitHub] incubator-hawq pull request: HAWQ-462. Adding dfs_address from pgf...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/412#discussion_r55083101
--- Diff: src/backend/access/external/pxfuriparser.c ---
@@ -719,7 +727,7 @@ GPHDUri_parse_fragment(char* fragment, List* fragments)
 has_user_data = true;
 *value_end = '\0';
 }
- fragment_data->fragment_md = pstrdup(value_start);
+ fragment_data->dfs_address = pstrdup(value_start);
--- End diff --
Please make sure to pfree it in GPHDUri_free_fragments. I also noticed that we don't free fragment_md and user_data.
[GitHub] incubator-hawq pull request: HAWQ-462. Adding dfs_address from pgf...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/412#discussion_r55082603
--- Diff: src/backend/access/external/hd_work_mgr.c ---
@@ -783,6 +786,10 @@ make_allocation_output_string(List *segment_fragments)
 appendStringInfoChar(_str, SEGWORK_IN_PAIR_DELIM);
 if (frag->fragment_md)
 appendStringInfo(_str, "%s", frag->fragment_md);
+ /* Adding dfs_address from pg_filespace entry required for HAWQ-462 */
+ appendStringInfoChar(_str, SEGWORK_IN_PAIR_DELIM);
+ if (dfs_address)
--- End diff --
Why do we need to pass this value for each fragment? Wouldn't it be better to pass it just once, perhaps appending it at the end of the segments string (after the loop)?
[GitHub] incubator-hawq pull request:
Github user hornn commented on the pull request: https://github.com/apache/incubator-hawq/commit/495bb2a2b264da533be237d312507d5e97faec53#commitcomment-16435206
In src/bin/gpfusion/gpbridgeapi.c on line 518: I don't know whether the filespace info is available from the segment - but from your description it sounds like it's not. In that case, I think the solution might be to propagate the dfs_url from the master to the segments as part of the external scan info. For readable tables it should be easy - the data can be added in hd_work_mgr, together with the fragments info. For writable tables it's trickier, because there is no pxf-specific code that is called on the master for writable tables. So I think a better solution would be to always make this info available to the external protocol code, by making it part of the external scan, or perhaps by populating a global variable when the segment starts.
[GitHub] incubator-hawq pull request: Hawq 317
Github user hornn closed the pull request at: https://github.com/apache/incubator-hawq/pull/279
[GitHub] incubator-hawq pull request: Karthifoyzur hawq 368
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/375#discussion_r53857118
--- Diff: src/test/regress/pg_regress_main.c ---
@@ -108,12 +108,7 @@ static void psql_init(void)
 {
 /* set default regression database name */
- if (!dblist)
--- End diff --
Why is the if check not needed anymore? Can't psql_init be called a second time when dblist is not null?
[GitHub] incubator-hawq pull request: Karthifoyzur hawq 368
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/375#discussion_r53856594
--- Diff: src/backend/tcop/postgres.c ---
@@ -4652,7 +4652,22 @@ PostgresMain(int argc, char *argv[], const char *username)
 }
 IdleTracker_DeactivateProcess();
- firstchar = ReadCommand(_message);
+ /*
+* During read command, if we have any exception we want to reactivate the process
+* before falling back to postgres main exception handler.
+*/
+ PG_TRY();
+ {
+ firstchar = ReadCommand(_message);
+ }
+ PG_CATCH();
+ {
+ IdleTracker_ActivateProcess();
+ elog(LOG, "Caught exception while reading command");
+ PG_RE_THROW();
+ }
+PG_END_TRY();
--- End diff --
fix indentation
[GitHub] incubator-hawq pull request: HAWQ-178: Add JSON plugin support in ...
Github user hornn commented on the pull request: https://github.com/apache/incubator-hawq/pull/302#issuecomment-185331246
There are also a few minor fixes to be merged: https://github.com/tzolov/incubator-hawq/pull/1
[GitHub] incubator-hawq pull request: Hawq 317
Github user hornn commented on the pull request: https://github.com/apache/incubator-hawq/pull/279#issuecomment-185330578
Awaiting verification on Isilon cluster.
[GitHub] incubator-hawq pull request: HAWQ-405. Fix for handling of pg_temp...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/338#discussion_r52531944
--- Diff: src/backend/catalog/namespace.c ---
@@ -1495,18 +1495,34 @@ LookupInternalNamespaceId(const char *nspname)
 Oid
 LookupNamespaceId(const char *nspname, Oid dboid)
 {
-if(gp_upgrade_mode){
+
+ /* check for pg_temp alias */
+ if (NSPDBOID_CURRENT == dboid && strcmp(nspname, "pg_temp") == 0)
+ {
+ if (TempNamespaceValid(true))
+ return myTempNamespace;
+ /*
+* Since this is used only for looking up existing objects, there
+* is no point in trying to initialize the temp namespace here;
+* and doing so might create problems for some callers.
+* Just fall through and give the "does not exist" error.
+*/
+ }
+
+if(gp_upgrade_mode)
+{
 // This code is only need at hawq2.0 upgrade, in this case the old pg_namespace doesn't
 // have column nspdboid, so we report error in else clause
-if(ObjectIdGetDatum(dboid)==NSPDBOID_CURRENT){
+if(ObjectIdGetDatum(dboid)==NSPDBOID_CURRENT)
+{
 return caql_getoid(
 NULL,
 cql("SELECT oid FROM pg_namespace "
 " WHERE nspname = :1",
 CStringGetDatum((char *) nspname)));
-}else{
-elog(ERROR, "Upgrade cann't process the namespace: %s in dbid: %i", nspname, dboid);
 }
+else
+ elog(ERROR, "Upgrade cann't process the namespace: %s in dbid: %i", nspname, dboid);
--- End diff --
typo cannot
[GitHub] incubator-hawq pull request: HAWQ-405. Fix for handling of pg_temp...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/338#discussion_r52534496
--- Diff: src/backend/catalog/namespace.c ---
@@ -1495,18 +1495,34 @@ LookupInternalNamespaceId(const char *nspname)
 Oid
 LookupNamespaceId(const char *nspname, Oid dboid)
 {
-if(gp_upgrade_mode){
+
+ /* check for pg_temp alias */
+ if (NSPDBOID_CURRENT == dboid && strcmp(nspname, "pg_temp") == 0)
+ {
+ if (TempNamespaceValid(true))
+ return myTempNamespace;
+ /*
+* Since this is used only for looking up existing objects, there
+* is no point in trying to initialize the temp namespace here;
+* and doing so might create problems for some callers.
+* Just fall through and give the "does not exist" error.
+*/
+ }
+
+if(gp_upgrade_mode)
--- End diff --
It should not be part of this commit though.
[GitHub] incubator-hawq pull request: HAWQ-405. Fix for handling of pg_temp...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/338#discussion_r52548075
--- Diff: src/test/regress/expected/temp.out ---
@@ -170,3 +170,19 @@ select * from whereami;
 -- you can invoke a temp function explicitly, though
 -- select pg_temp.whoami();
 drop table public.whereami;
+create temp table temptest (row integer, count integer);
--- End diff --
I think it would be better to have the exact scenario that failed:
```
CREATE TABLE pg_temp.temptest ...
SELECT avg(pg_temp.temptest.count) ...
```
Do you think we should cover all three cases (db.table.col, table.col, col)?
[GitHub] incubator-hawq pull request: HAWQ-178: Add JSON plugin support in ...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/302#discussion_r52079820
--- Diff: pxf/pxf-json/src/main/java/org/apache/hawq/pxf/plugins/json/JsonResolver.java ---
@@ -0,0 +1,252 @@
+package org.apache.hawq.pxf.plugins.json;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Iterator;
+import java.util.List;
+
+import org.apache.commons.logging.Log;
+import org.apache.commons.logging.LogFactory;
+import org.apache.hawq.pxf.api.OneField;
+import org.apache.hawq.pxf.api.OneRow;
+import org.apache.hawq.pxf.api.ReadResolver;
+import org.apache.hawq.pxf.api.io.DataType;
+import org.apache.hawq.pxf.api.utilities.ColumnDescriptor;
+import org.apache.hawq.pxf.api.utilities.InputData;
+import org.apache.hawq.pxf.api.utilities.Plugin;
+import org.codehaus.jackson.JsonFactory;
+import org.codehaus.jackson.JsonNode;
+import org.codehaus.jackson.map.ObjectMapper;
+
+/**
+ * This JSON resolver for PXF will decode a given object from the {@link JsonAccessor} into a row for HAWQ. It will
+ * decode this data into a JsonNode and walk the tree for each column. It supports normal value mapping via projections
+ * and JSON array indexing.
+ */
+public class JsonResolver extends Plugin implements ReadResolver {
+
+ private static final Log LOG = LogFactory.getLog(JsonResolver.class);
+
+ private ArrayList<OneField> oneFieldList;
+ private ColumnDescriptorCache[] columnDescriptorCache;
+ private ObjectMapper mapper;
+
+ /**
+ * Row with empty fields. Returned in case of broken or malformed json records.
+ */
+ private final List<OneField> emptyRow;
+
+ public JsonResolver(InputData inputData) throws Exception {
+ super(inputData);
+ oneFieldList = new ArrayList<OneField>();
+ mapper = new ObjectMapper(new JsonFactory());
+
+ // Precompute the column metadata. The metadata is used for mapping column names to json nodes.
+ columnDescriptorCache = new ColumnDescriptorCache[inputData.getColumns()];
+ for (int i = 0; i < inputData.getColumns(); ++i) {
+ ColumnDescriptor cd = inputData.getColumn(i);
+ columnDescriptorCache[i] = new ColumnDescriptorCache(cd);
+ }
+
+ emptyRow = createEmptyRow();
+ }
+
+ @Override
+ public List<OneField> getFields(OneRow row) throws Exception {
+ oneFieldList.clear();
+
+ String jsonRecordAsText = row.getData().toString();
+
+ JsonNode root = decodeLineToJsonNode(jsonRecordAsText);
+
+ if (root == null) {
+ LOG.warn("Return empty-fields row due to invalid JSON: " + jsonRecordAsText);
+ return emptyRow;
+ }
+
+ // Iterate through the column definition and fetch our JSON data
+ for (ColumnDescriptorCache columnMetadata : columnDescriptorCache) {
+
+ JsonNode node = getChildJsonNode(root, columnMetadata.getNormalizedProjections());
+
+ // If this node is null or missing, add a null value here
+ if (node == null || node.isMissingNode()) {
+ addNullField(columnMetadata.getColumnType());
+ } else if (columnMetadata.isArray()) {
+ // If this column is an array index, ex. "tweet.hashtags[0]"
+ if (node.isArray()) {
+ // If the JSON node is an array, then add it to our list
+ addFieldFromJsonArray(columnMetadata.getColumnType(), node, columnMetadata.getArrayNodeIndex());
+ } else {
+
[GitHub] incubator-hawq pull request: HAWQ-178: Add JSON plugin support in ...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/302#discussion_r51906457
--- Diff: pxf/pxf-json/src/test/java/org/apache/hawq/pxf/plugins/json/PxfUnit.java ---
@@ -0,0 +1,672 @@
+package org.apache.hawq.pxf.plugins.json;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.io.BufferedReader;
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.InputStreamReader;
+import java.io.OutputStream;
+import java.lang.reflect.Constructor;
+import java.security.InvalidParameterException;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.commons.logging.Log;
+import org.apache.commons.logging.LogFactory;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hawq.pxf.api.Fragment;
+import org.apache.hawq.pxf.api.Fragmenter;
+import org.apache.hawq.pxf.api.OneField;
+import org.apache.hawq.pxf.api.OneRow;
+import org.apache.hawq.pxf.api.ReadAccessor;
+import org.apache.hawq.pxf.api.ReadResolver;
+import org.apache.hawq.pxf.api.WriteAccessor;
+import org.apache.hawq.pxf.api.WriteResolver;
+import org.apache.hawq.pxf.api.io.DataType;
+import org.apache.hawq.pxf.api.utilities.InputData;
+import org.apache.hawq.pxf.service.FragmentsResponse;
--- End diff --
Let's leave it as a follow up.
[GitHub] incubator-hawq pull request: HAWQ-178: Add JSON plugin support in ...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/302#discussion_r51818643 --- Diff: pxf/pxf-json/src/main/java/org/apache/hawq/pxf/plugins/json/ColumnDescriptorCache.java --- @@ -0,0 +1,165 @@ +package org.apache.hawq.pxf.plugins.json; + +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +import java.util.regex.Matcher; +import java.util.regex.Pattern; + +import org.apache.hawq.pxf.api.io.DataType; +import org.apache.hawq.pxf.api.utilities.ColumnDescriptor; + +/** + * Helper class used to retrieve all column details relevant for the json processing. 
+ */ +public class ColumnDescriptorCache { + + private static Pattern ARRAY_PROJECTION_PATTERN = Pattern.compile("(.*)\\[([0-9]+)\\]"); + private static int ARRAY_NAME_GROUPID = 1; + private static int ARRAY_INDEX_GROUPID = 2; + + private final DataType columnType; + private final String[] projections; + private final String nodeName; + private final int arrayIndex; + private final boolean isArray; + + public ColumnDescriptorCache(ColumnDescriptor columnDescriptor) { + + this.columnType = DataType.get(columnDescriptor.columnTypeCode()); + + this.projections = columnDescriptor.columnName().split("\\."); + + this.isArray = isArrayIndex(projections); --- End diff -- slight improvement - if we encapsulate the whole array logic in a function, we can call `getMatchGroup` once, and use the results to populate this.isArray, this.nodeName and this.arrayIndex. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
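The refactoring suggested above could look roughly like the sketch below: the pattern is run once against the projection segment, and `isArray`, `nodeName`, and `arrayIndex` are all derived from that single match. The class and helper names here (`ArrayProjection`, `parse`) are hypothetical, not taken from the PR.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ArrayProjection {

    // Same pattern as the quoted diff, with the first group tightened to .+
    // as suggested in the neighboring review comment.
    private static final Pattern ARRAY_PROJECTION_PATTERN = Pattern.compile("(.+)\\[([0-9]+)\\]");

    final boolean isArray;
    final String nodeName;
    final int arrayIndex;

    private ArrayProjection(boolean isArray, String nodeName, int arrayIndex) {
        this.isArray = isArray;
        this.nodeName = nodeName;
        this.arrayIndex = arrayIndex;
    }

    /** Runs the regex once and populates all three values from the single Matcher. */
    static ArrayProjection parse(String projection) {
        Matcher m = ARRAY_PROJECTION_PATTERN.matcher(projection);
        if (m.matches()) {
            return new ArrayProjection(true, m.group(1), Integer.parseInt(m.group(2)));
        }
        return new ArrayProjection(false, projection, -1);
    }

    public static void main(String[] args) {
        ArrayProjection p = ArrayProjection.parse("hashtags[0]");
        System.out.println(p.isArray + " " + p.nodeName + " " + p.arrayIndex); // true hashtags 0
    }
}
```

With this shape the constructor of `ColumnDescriptorCache` would call `parse` once on the last projection segment instead of invoking the matcher separately for each field.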
[GitHub] incubator-hawq pull request: HAWQ-178: Add JSON plugin support in ...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/302#discussion_r51818210 --- Diff: pxf/pxf-json/src/main/java/org/apache/hawq/pxf/plugins/json/ColumnDescriptorCache.java --- @@ -0,0 +1,165 @@ +package org.apache.hawq.pxf.plugins.json; + +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +import java.util.regex.Matcher; +import java.util.regex.Pattern; + +import org.apache.hawq.pxf.api.io.DataType; +import org.apache.hawq.pxf.api.utilities.ColumnDescriptor; + +/** + * Helper class used to retrieve all column details relevant for the json processing. + */ +public class ColumnDescriptorCache { + + private static Pattern ARRAY_PROJECTION_PATTERN = Pattern.compile("(.*)\\[([0-9]+)\\]"); --- End diff -- just to be safe, I think the first group should be `.+` and not `.*`. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
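To illustrate the `.+` vs `.*` point: with `.*`, a malformed column name consisting of only a bracketed index still matches, yielding an empty node name in group 1, while `.+` rejects it. A minimal check (the class name `PatternCheck` is illustrative):

```java
import java.util.regex.Pattern;

public class PatternCheck {

    private static final Pattern STAR = Pattern.compile("(.*)\\[([0-9]+)\\]");
    private static final Pattern PLUS = Pattern.compile("(.+)\\[([0-9]+)\\]");

    static boolean matchesStar(String s) {
        return STAR.matcher(s).matches();
    }

    static boolean matchesPlus(String s) {
        return PLUS.matcher(s).matches();
    }

    public static void main(String[] args) {
        // "[0]" has no field name before the bracket.
        System.out.println(matchesStar("[0]")); // true  - empty node name slips through
        System.out.println(matchesPlus("[0]")); // false - rejected, as the review suggests
    }
}
```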
[GitHub] incubator-hawq pull request: HAWQ-178: Add JSON plugin support in ...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/302#discussion_r51833835 --- Diff: pxf/pxf-json/src/main/java/org/apache/hawq/pxf/plugins/json/parser/PartitionedJsonParser.java --- @@ -0,0 +1,198 @@ +package org.apache.hawq.pxf.plugins.json.parser; + +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +import java.io.IOException; +import java.io.InputStream; +import java.io.InputStreamReader; +import java.nio.charset.StandardCharsets; +import java.util.ArrayList; +import java.util.EnumSet; +import java.util.List; + +import org.apache.hawq.pxf.plugins.json.parser.JsonLexer.JsonLexerState; + +/** + * A simple parser that can support reading JSON objects from a random point in JSON text. It reads from the supplied + * stream (which is assumed to be positioned at any arbitrary position inside some JSON text) until it find the first --- End diff -- typo finds --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. 
[GitHub] incubator-hawq pull request: HAWQ-178: Add JSON plugin support in ...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/302#discussion_r51834024 --- Diff: pxf/pxf-json/src/main/java/org/apache/hawq/pxf/plugins/json/parser/PartitionedJsonParser.java --- @@ -0,0 +1,198 @@ +package org.apache.hawq.pxf.plugins.json.parser; + +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +import java.io.IOException; +import java.io.InputStream; +import java.io.InputStreamReader; +import java.nio.charset.StandardCharsets; +import java.util.ArrayList; +import java.util.EnumSet; +import java.util.List; + +import org.apache.hawq.pxf.plugins.json.parser.JsonLexer.JsonLexerState; + +/** + * A simple parser that can support reading JSON objects from a random point in JSON text. It reads from the supplied + * stream (which is assumed to be positioned at any arbitrary position inside some JSON text) until it find the first + * JSON begin-object "{". From this point on it will keep reading JSON objects until it finds one containing a member + * string that the user supplies. + * + * It is not recommended to use this with JSON text where individual JSON objects that can be large (MB's or larger). 
+ */ +public class PartitionedJsonParser { + + private static final int EOF = -1; + private final InputStreamReader inputStreamReader; + private final JsonLexer lexer; + private long bytesRead = 0; + private boolean endOfStream; + + public PartitionedJsonParser(InputStream is) { + this.lexer = new JsonLexer(); + + // You need to wrap the InputStream with an InputStreamReader, so that it can encode the incoming byte stream as + // UTF-8 characters + this.inputStreamReader = new InputStreamReader(is, StandardCharsets.UTF_8); + } + + private boolean scanToFirstBeginObject() throws IOException { + // seek until we hit the first begin-object + char prev = ' '; + int i; + while ((i = inputStreamReader.read()) != EOF) { + char c = (char) i; + bytesRead++; + if (c == '{' && prev != '\\') { --- End diff -- `{` and `\\` should probably be consts --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
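Naming the magic characters as constants, per the comment above, might look like this sketch; it mirrors the escape check from the quoted loop, and the constant names are illustrative, not from the PR:

```java
public class JsonScanChars {

    private static final char BEGIN_OBJECT = '{';
    private static final char ESCAPE = '\\';

    /** True when c opens a JSON object and the preceding character is not an escape. */
    static boolean isUnescapedBeginObject(char c, char prev) {
        return c == BEGIN_OBJECT && prev != ESCAPE;
    }

    public static void main(String[] args) {
        System.out.println(isUnescapedBeginObject('{', ' '));  // true
        System.out.println(isUnescapedBeginObject('{', '\\')); // false
    }
}
```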
[GitHub] incubator-hawq pull request: HAWQ-178: Add JSON plugin support in ...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/302#discussion_r51835615 --- Diff: pxf/pxf-json/src/test/java/org/apache/hawq/pxf/plugins/json/JsonExtensionTest.java --- @@ -0,0 +1,288 @@ +package org.apache.hawq.pxf.plugins.json; + +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ + +import java.io.File; +import java.util.ArrayList; +import java.util.List; + +import org.apache.hadoop.fs.Path; +import org.apache.hawq.pxf.api.Fragmenter; +import org.apache.hawq.pxf.api.ReadAccessor; +import org.apache.hawq.pxf.api.ReadResolver; +import org.apache.hawq.pxf.api.io.DataType; +import org.apache.hawq.pxf.plugins.hdfs.HdfsDataFragmenter; +import org.apache.hawq.pxf.plugins.json.JsonAccessor; +import org.apache.hawq.pxf.plugins.json.JsonResolver; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; + +public class JsonExtensionTest extends PxfUnit { + + private List<Pair<String, DataType>> columnDefs = null; + private List<Pair<String, String>> extraParams = new ArrayList<Pair<String, String>>(); + + @Before + public void before() { + + columnDefs = new ArrayList<Pair<String, DataType>>(); + + columnDefs.add(new Pair<String, DataType>("created_at", DataType.TEXT)); + columnDefs.add(new Pair<String, DataType>("id", DataType.BIGINT)); + columnDefs.add(new Pair<String, DataType>("text", DataType.TEXT)); + columnDefs.add(new Pair<String, DataType>("user.screen_name", DataType.TEXT)); + columnDefs.add(new Pair<String, DataType>("entities.hashtags[0]", DataType.TEXT)); + columnDefs.add(new Pair<String, DataType>("coordinates.coordinates[0]", DataType.FLOAT8)); + columnDefs.add(new Pair<String, DataType>("coordinates.coordinates[1]", DataType.FLOAT8)); + } + + @After + public void cleanup() throws Exception { + columnDefs.clear(); + extraParams.clear(); + } + + @Test + public void testCompressedMultilineJsonFile() throws Exception { + + extraParams.add(new Pair<String, String>("IDENTIFIER", "created_at")); + + List output = new ArrayList(); + + output.add("Fri Jun 07 22:45:02 + 2013,343136547115253761,text1,SpreadButter,tweetCongress,,"); + output.add("Fri Jun 07 22:45:02 + 2013,343136547123646465,text2,patronusdeadly,,,"); + output.add("Fri Jun 07 22:45:02 + 2013,343136547136233472,text3,NoSecrets_Vagas,,,"); + + 
super.assertOutput(new Path(System.getProperty("user.dir") + File.separator + + "src/test/resources/tweets.tar.gz"), output); + } + + @Test + public void testMaxRecordLength() throws Exception { + + // variable-size-objects.json contains 3 json objects but only 2 of them fit in the 27 byte length limitation + + extraParams.add(new Pair<String, String>("IDENTIFIER", "key666")); + extraParams.add(new Pair<String, String>("MAXLENGTH", "27")); + + columnDefs.clear(); + columnDefs.add(new Pair<String, DataType>("key666", DataType.TEXT)); + + List output = new ArrayList(); + + output.add("small object1"); + // skip the large object2 + output.add("small object3"); + + super.assertOutput(new Path(System.getProperty("user.dir"
[GitHub] incubator-hawq pull request: HAWQ-178: Add JSON plugin support in ...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/302#discussion_r51836013 --- Diff: pxf/pxf-json/src/test/java/org/apache/hawq/pxf/plugins/json/PxfUnit.java --- @@ -0,0 +1,672 @@ +package org.apache.hawq.pxf.plugins.json; + +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ + +import java.io.BufferedReader; +import java.io.ByteArrayOutputStream; +import java.io.IOException; +import java.io.InputStreamReader; +import java.io.OutputStream; +import java.lang.reflect.Constructor; +import java.security.InvalidParameterException; +import java.util.ArrayList; +import java.util.HashMap; +import java.util.Iterator; +import java.util.List; +import java.util.Map; + +import org.apache.commons.logging.Log; +import org.apache.commons.logging.LogFactory; +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.fs.FileSystem; +import org.apache.hadoop.fs.Path; +import org.apache.hawq.pxf.api.Fragment; +import org.apache.hawq.pxf.api.Fragmenter; +import org.apache.hawq.pxf.api.OneField; +import org.apache.hawq.pxf.api.OneRow; +import org.apache.hawq.pxf.api.ReadAccessor; +import org.apache.hawq.pxf.api.ReadResolver; +import org.apache.hawq.pxf.api.WriteAccessor; +import org.apache.hawq.pxf.api.WriteResolver; +import org.apache.hawq.pxf.api.io.DataType; +import org.apache.hawq.pxf.api.utilities.InputData; +import org.apache.hawq.pxf.service.FragmentsResponse; --- End diff -- This test file breaks the convention of separating between pxf-service and the plugins. Only pxf-api is supposed to be used by the plugins. I think it's a really good test mechanism, and we should use it, but it definitely should be moved to another location - perhaps its own package, perhaps under pxf-service. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-hawq pull request: HAWQ-178: Add JSON plugin support in ...
Github user hornn commented on the pull request: https://github.com/apache/incubator-hawq/pull/302#issuecomment-179682943 @tzolov regarding test coverage, what you did is very impressive. Normally we have an automation suite where we test end-to-end queries using a specific plugin, but I really like the idea of PxfUnit. I think after your commit is done we'll also have to add automation tests, but they will probably not be as exhaustive as what you already did. ---
[GitHub] incubator-hawq pull request: HAWQ-178: Add JSON plugin support in ...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/302#discussion_r51834454 --- Diff: pxf/pxf-json/src/main/java/org/apache/hawq/pxf/plugins/json/parser/PartitionedJsonParser.java --- @@ -0,0 +1,198 @@ +package org.apache.hawq.pxf.plugins.json.parser; + +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +import java.io.IOException; +import java.io.InputStream; +import java.io.InputStreamReader; +import java.nio.charset.StandardCharsets; +import java.util.ArrayList; +import java.util.EnumSet; +import java.util.List; + +import org.apache.hawq.pxf.plugins.json.parser.JsonLexer.JsonLexerState; + +/** + * A simple parser that can support reading JSON objects from a random point in JSON text. It reads from the supplied + * stream (which is assumed to be positioned at any arbitrary position inside some JSON text) until it find the first + * JSON begin-object "{". From this point on it will keep reading JSON objects until it finds one containing a member + * string that the user supplies. + * + * It is not recommended to use this with JSON text where individual JSON objects that can be large (MB's or larger). 
+ */ +public class PartitionedJsonParser { + + private static final int EOF = -1; + private final InputStreamReader inputStreamReader; + private final JsonLexer lexer; + private long bytesRead = 0; + private boolean endOfStream; --- End diff -- for clarity, perhaps explicitly assign `false` to it? ---
[GitHub] incubator-hawq pull request: HAWQ-178: Add JSON plugin support in ...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/302#discussion_r51835076 --- Diff: pxf/pxf-json/src/test/java/org/apache/hawq/pxf/plugins/json/JsonExtensionTest.java --- @@ -0,0 +1,288 @@ +package org.apache.hawq.pxf.plugins.json; + +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ + +import java.io.File; +import java.util.ArrayList; +import java.util.List; + +import org.apache.hadoop.fs.Path; +import org.apache.hawq.pxf.api.Fragmenter; +import org.apache.hawq.pxf.api.ReadAccessor; +import org.apache.hawq.pxf.api.ReadResolver; +import org.apache.hawq.pxf.api.io.DataType; +import org.apache.hawq.pxf.plugins.hdfs.HdfsDataFragmenter; +import org.apache.hawq.pxf.plugins.json.JsonAccessor; +import org.apache.hawq.pxf.plugins.json.JsonResolver; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; + +public class JsonExtensionTest extends PxfUnit { + + private List<Pair<String, DataType>> columnDefs = null; + private List<Pair<String, String>> extraParams = new ArrayList<Pair<String, String>>(); + + @Before + public void before() { + + columnDefs = new ArrayList<Pair<String, DataType>>(); + + columnDefs.add(new Pair<String, DataType>("created_at", DataType.TEXT)); + columnDefs.add(new Pair<String, DataType>("id", DataType.BIGINT)); + columnDefs.add(new Pair<String, DataType>("text", DataType.TEXT)); + columnDefs.add(new Pair<String, DataType>("user.screen_name", DataType.TEXT)); + columnDefs.add(new Pair<String, DataType>("entities.hashtags[0]", DataType.TEXT)); + columnDefs.add(new Pair<String, DataType>("coordinates.coordinates[0]", DataType.FLOAT8)); + columnDefs.add(new Pair<String, DataType>("coordinates.coordinates[1]", DataType.FLOAT8)); + } + + @After + public void cleanup() throws Exception { + columnDefs.clear(); + extraParams.clear(); + } + + @Test + public void testCompressedMultilineJsonFile() throws Exception { + + extraParams.add(new Pair<String, String>("IDENTIFIER", "created_at")); + + List output = new ArrayList(); + + output.add("Fri Jun 07 22:45:02 + 2013,343136547115253761,text1,SpreadButter,tweetCongress,,"); + output.add("Fri Jun 07 22:45:02 + 2013,343136547123646465,text2,patronusdeadly,,,"); + output.add("Fri Jun 07 22:45:02 + 2013,343136547136233472,text3,NoSecrets_Vagas,,,"); + + 
super.assertOutput(new Path(System.getProperty("user.dir") + File.separator + + "src/test/resources/tweets.tar.gz"), output); + } + + @Test + public void testMaxRecordLength() throws Exception { + + // variable-size-objects.json contains 3 json objects but only 2 of them fit in the 27 byte length limitation + + extraParams.add(new Pair<String, String>("IDENTIFIER", "key666")); + extraParams.add(new Pair<String, String>("MAXLENGTH", "27")); + + columnDefs.clear(); + columnDefs.add(new Pair<String, DataType>("key666", DataType.TEXT)); + + List output = new ArrayList(); + + output.add("small object1"); + // skip the large object2 + output.add("small object3"); + + super.assertOutput(new Path(System.getProperty("user.dir"
[GitHub] incubator-hawq pull request: HAWQ-178: Add JSON plugin support in ...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/302#discussion_r51835264 --- Diff: pxf/pxf-json/src/test/java/org/apache/hawq/pxf/plugins/json/JsonExtensionTest.java --- @@ -0,0 +1,288 @@ +package org.apache.hawq.pxf.plugins.json; + +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ + +import java.io.File; +import java.util.ArrayList; +import java.util.List; + +import org.apache.hadoop.fs.Path; +import org.apache.hawq.pxf.api.Fragmenter; +import org.apache.hawq.pxf.api.ReadAccessor; +import org.apache.hawq.pxf.api.ReadResolver; +import org.apache.hawq.pxf.api.io.DataType; +import org.apache.hawq.pxf.plugins.hdfs.HdfsDataFragmenter; +import org.apache.hawq.pxf.plugins.json.JsonAccessor; +import org.apache.hawq.pxf.plugins.json.JsonResolver; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; + +public class JsonExtensionTest extends PxfUnit { + + private List<Pair<String, DataType>> columnDefs = null; + private List<Pair<String, String>> extraParams = new ArrayList<Pair<String, String>>(); + + @Before + public void before() { + + columnDefs = new ArrayList<Pair<String, DataType>>(); + + columnDefs.add(new Pair<String, DataType>("created_at", DataType.TEXT)); + columnDefs.add(new Pair<String, DataType>("id", DataType.BIGINT)); + columnDefs.add(new Pair<String, DataType>("text", DataType.TEXT)); + columnDefs.add(new Pair<String, DataType>("user.screen_name", DataType.TEXT)); + columnDefs.add(new Pair<String, DataType>("entities.hashtags[0]", DataType.TEXT)); + columnDefs.add(new Pair<String, DataType>("coordinates.coordinates[0]", DataType.FLOAT8)); + columnDefs.add(new Pair<String, DataType>("coordinates.coordinates[1]", DataType.FLOAT8)); + } + + @After + public void cleanup() throws Exception { + columnDefs.clear(); + extraParams.clear(); + } + + @Test + public void testCompressedMultilineJsonFile() throws Exception { + + extraParams.add(new Pair<String, String>("IDENTIFIER", "created_at")); + + List output = new ArrayList(); + + output.add("Fri Jun 07 22:45:02 + 2013,343136547115253761,text1,SpreadButter,tweetCongress,,"); + output.add("Fri Jun 07 22:45:02 + 2013,343136547123646465,text2,patronusdeadly,,,"); + output.add("Fri Jun 07 22:45:02 + 2013,343136547136233472,text3,NoSecrets_Vagas,,,"); + + 
super.assertOutput(new Path(System.getProperty("user.dir") + File.separator + + "src/test/resources/tweets.tar.gz"), output); + } + + @Test + public void testMaxRecordLength() throws Exception { + + // variable-size-objects.json contains 3 json objects but only 2 of them fit in the 27 byte length limitation + + extraParams.add(new Pair<String, String>("IDENTIFIER", "key666")); + extraParams.add(new Pair<String, String>("MAXLENGTH", "27")); + + columnDefs.clear(); + columnDefs.add(new Pair<String, DataType>("key666", DataType.TEXT)); + + List output = new ArrayList(); + + output.add("small object1"); + // skip the large object2 + output.add("small object3"); + + super.assertOutput(new Path(System.getProperty("user.dir") + File.separator
[GitHub] incubator-hawq pull request: HAWQ-178: Add JSON plugin support in ...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/302#discussion_r51835237 --- Diff: pxf/pxf-json/src/test/java/org/apache/hawq/pxf/plugins/json/JsonExtensionTest.java --- @@ -0,0 +1,288 @@ +package org.apache.hawq.pxf.plugins.json; + +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ + +import java.io.File; +import java.util.ArrayList; +import java.util.List; + +import org.apache.hadoop.fs.Path; +import org.apache.hawq.pxf.api.Fragmenter; +import org.apache.hawq.pxf.api.ReadAccessor; +import org.apache.hawq.pxf.api.ReadResolver; +import org.apache.hawq.pxf.api.io.DataType; +import org.apache.hawq.pxf.plugins.hdfs.HdfsDataFragmenter; +import org.apache.hawq.pxf.plugins.json.JsonAccessor; +import org.apache.hawq.pxf.plugins.json.JsonResolver; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; + +public class JsonExtensionTest extends PxfUnit { + + private List<Pair<String, DataType>> columnDefs = null; + private List<Pair<String, String>> extraParams = new ArrayList<Pair<String, String>>(); + + @Before + public void before() { + + columnDefs = new ArrayList<Pair<String, DataType>>(); + + columnDefs.add(new Pair<String, DataType>("created_at", DataType.TEXT)); + columnDefs.add(new Pair<String, DataType>("id", DataType.BIGINT)); + columnDefs.add(new Pair<String, DataType>("text", DataType.TEXT)); + columnDefs.add(new Pair<String, DataType>("user.screen_name", DataType.TEXT)); + columnDefs.add(new Pair<String, DataType>("entities.hashtags[0]", DataType.TEXT)); + columnDefs.add(new Pair<String, DataType>("coordinates.coordinates[0]", DataType.FLOAT8)); + columnDefs.add(new Pair<String, DataType>("coordinates.coordinates[1]", DataType.FLOAT8)); + } + + @After + public void cleanup() throws Exception { + columnDefs.clear(); + extraParams.clear(); + } + + @Test + public void testCompressedMultilineJsonFile() throws Exception { + + extraParams.add(new Pair<String, String>("IDENTIFIER", "created_at")); + + List output = new ArrayList(); + + output.add("Fri Jun 07 22:45:02 + 2013,343136547115253761,text1,SpreadButter,tweetCongress,,"); + output.add("Fri Jun 07 22:45:02 + 2013,343136547123646465,text2,patronusdeadly,,,"); + output.add("Fri Jun 07 22:45:02 + 2013,343136547136233472,text3,NoSecrets_Vagas,,,"); + + 
super.assertOutput(new Path(System.getProperty("user.dir") + File.separator + + "src/test/resources/tweets.tar.gz"), output); + } + + @Test + public void testMaxRecordLength() throws Exception { + + // variable-size-objects.json contains 3 json objects but only 2 of them fit in the 27 byte length limitation + + extraParams.add(new Pair<String, String>("IDENTIFIER", "key666")); + extraParams.add(new Pair<String, String>("MAXLENGTH", "27")); + + columnDefs.clear(); --- End diff -- I think the clear is not necessary since it's also done in the @After function --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-hawq pull request: HAWQ-178: Add JSON plugin support in ...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/302#discussion_r51835736 --- Diff: pxf/pxf-json/src/test/java/org/apache/hawq/pxf/plugins/json/PxfUnit.java --- @@ -0,0 +1,672 @@ +package org.apache.hawq.pxf.plugins.json; + +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ + +import java.io.BufferedReader; +import java.io.ByteArrayOutputStream; +import java.io.IOException; +import java.io.InputStreamReader; +import java.io.OutputStream; +import java.lang.reflect.Constructor; +import java.security.InvalidParameterException; +import java.util.ArrayList; +import java.util.HashMap; +import java.util.Iterator; +import java.util.List; +import java.util.Map; + +import org.apache.commons.logging.Log; +import org.apache.commons.logging.LogFactory; +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.fs.FileSystem; +import org.apache.hadoop.fs.Path; +import org.apache.hawq.pxf.api.Fragment; +import org.apache.hawq.pxf.api.Fragmenter; +import org.apache.hawq.pxf.api.OneField; +import org.apache.hawq.pxf.api.OneRow; +import org.apache.hawq.pxf.api.ReadAccessor; +import org.apache.hawq.pxf.api.ReadResolver; +import org.apache.hawq.pxf.api.WriteAccessor; +import org.apache.hawq.pxf.api.WriteResolver; +import org.apache.hawq.pxf.api.io.DataType; +import org.apache.hawq.pxf.api.utilities.InputData; +import org.apache.hawq.pxf.service.FragmentsResponse; +import org.apache.hawq.pxf.service.FragmentsResponseFormatter; +import org.apache.hawq.pxf.service.utilities.ProtocolData; +import org.codehaus.jackson.JsonFactory; +import org.codehaus.jackson.JsonNode; +import org.codehaus.jackson.JsonParseException; +import org.codehaus.jackson.map.JsonMappingException; +import org.codehaus.jackson.map.ObjectMapper; +import org.junit.Assert; + +/** + * This abstract class contains a number of helpful utilities in developing a PXF extension for HAWQ. Extend this class + * and use the various assert methods to check given input against known output. 
+ */ +public abstract class PxfUnit { + + private static final Log LOG = LogFactory.getLog(PxfUnit.class); + + private static JsonFactory factory = new JsonFactory(); + private static ObjectMapper mapper = new ObjectMapper(factory); + + protected static List inputs = null; + + /** +* Uses the given input directory to run through the PXF unit testing framework. Uses the lines in the file for +* output testing. +* +* @param input +*Input records +* @param expectedOutput +*File containing output to check +* @throws Exception +*/ + public void assertOutput(Path input, Path expectedOutput) throws Exception { + + BufferedReader rdr = new BufferedReader(new InputStreamReader(FileSystem.get(new Configuration()).open( + expectedOutput))); + + List outputLines = new ArrayList(); + + String line; + while ((line = rdr.readLine()) != null) { + outputLines.add(line); + } + + assertOutput(input, outputLines); + + rdr.close(); + } + + /** +* Uses the given input directory to run through the PXF unit testing framework. Uses the lines in the given +* parameter for output testing. +* +* @param input +*Input records +* @param expectedOutput +*File containing output to check +* @throws Exception +*/ + public void assertOutput(Path input, List expectedOutput) throws Exception { + + setup(input); + List actualOutput = new ArrayList(); + for (InputData data : inputs) { + R
[GitHub] incubator-hawq pull request: HAWQ-178: Add JSON plugin support in ...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/302#discussion_r51836238 --- Diff: pxf/pxf-json/src/test/java/org/apache/hawq/pxf/plugins/json/parser/PartitionedJsonParserOffsetTest.java --- @@ -0,0 +1,54 @@ +package org.apache.hawq.pxf.plugins.json.parser; + +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +import static org.junit.Assert.assertEquals; +import static org.junit.Assert.assertNotNull; + +import java.io.ByteArrayInputStream; +import java.io.IOException; +import java.io.InputStream; + +import org.junit.Test; + +public class PartitionedJsonParserOffsetTest { + /* +* [{"color": "red","v": "vv"},{"color": "red","v": "vv"}] +*/ + public static String json2 = "[{\"color\": \"red\",\"v\": \"vv\"},{\"color\": \"red\",\"v\": \"vv\"}]"; + + @Test + public void testOffset() throws IOException { + InputStream jsonInputStream = createFromString(json2); --- End diff -- close stream --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. 
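The "close stream" comment is typically addressed with try-with-resources, which closes the stream even if the parser or an assertion throws mid-test. A minimal sketch (not the test's actual code):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class CloseStreamSketch {

    // reads one byte; try-with-resources guarantees the stream is closed
    // even when the code under test throws mid-way
    static int firstByte(String json) {
        try (InputStream in = new ByteArrayInputStream(json.getBytes(StandardCharsets.UTF_8))) {
            return in.read();
        } catch (IOException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        String json2 = "[{\"color\": \"red\",\"v\": \"vv\"}]";
        System.out.println((char) firstByte(json2)); // [
    }
}
```

For a ByteArrayInputStream the leak is harmless, but the pattern keeps the test correct if the stream type ever changes.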
[GitHub] incubator-hawq pull request: HAWQ-178: Add JSON plugin support in ...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/302#discussion_r51827170 --- Diff: pxf/pxf-json/src/main/java/org/apache/hawq/pxf/plugins/json/JsonRecordReader.java --- @@ -0,0 +1,176 @@ +package org.apache.hawq.pxf.plugins.json; + +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +import java.io.IOException; +import java.io.InputStream; + +import org.apache.commons.logging.Log; +import org.apache.commons.logging.LogFactory; +import org.apache.hadoop.fs.FSDataInputStream; +import org.apache.hadoop.fs.FileSystem; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.io.LongWritable; +import org.apache.hadoop.io.Text; +import org.apache.hadoop.io.compress.CompressionCodec; +import org.apache.hadoop.io.compress.CompressionCodecFactory; +import org.apache.hadoop.mapred.FileSplit; +import org.apache.hadoop.mapred.JobConf; +import org.apache.hadoop.mapred.RecordReader; +import org.apache.hawq.pxf.plugins.json.parser.PartitionedJsonParser; + +/** + * Multi-line json object reader. JsonRecordReader uses a member name (set by the IDENTIFIER PXF parameter) to + * determine the encapsulating object to extract and read. 
+ * + * JsonRecordReader supports compressed input files as well. + * + * As a safe guard set the optional MAXLENGTH parameter to limit the max size of a record. + */ +public class JsonRecordReader implements RecordReader<LongWritable, Text> { + + private static final Log LOG = LogFactory.getLog(JsonRecordReader.class); + + public static final String RECORD_MEMBER_IDENTIFIER = "json.input.format.record.identifier"; + public static final String RECORD_MAX_LENGTH = "multilinejsonrecordreader.maxlength"; + + private CompressionCodecFactory compressionCodecs = null; + private long start; + private long pos; + private long end; + private int maxObjectLength; + private InputStream is; + private PartitionedJsonParser parser; + private final String jsonMemberName; + + /** +* Create new multi-line json object reader. +* +* @param conf +*Hadoop context +* @param split +*HDFS split to start the reading from +* @throws IOException +*/ + public JsonRecordReader(JobConf conf, FileSplit split) throws IOException { + + this.jsonMemberName = conf.get(RECORD_MEMBER_IDENTIFIER); + this.maxObjectLength = conf.getInt(RECORD_MAX_LENGTH, Integer.MAX_VALUE); + + start = split.getStart(); + end = start + split.getLength(); + final Path file = split.getPath(); + compressionCodecs = new CompressionCodecFactory(conf); + final CompressionCodec codec = compressionCodecs.getCodec(file); --- End diff -- Regarding compression, we support it for hdfs files, and the implementation here looks very similar to the one in LineRecordReader. The issue worth mentioning here is that BZip2 Codec is not thread safe. For map-reduce jobs it doesn't matter, because each job is running in its own JVM, but we are handling multiple concurrent requests through tomcat, so that introduced a problem. For Hdfs files, we "solved" it by forcing the processing of bzip2 files to be done in a single thread. See HdfsUtilities.isThreadSafe() which is called by HdfsSplittableDataAccessor. 
I see that the JsonAccessor inherits from this class, so I think it should be fine, but it's worth verifying.
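The single-thread workaround described for bzip2 boils down to checking which codec is in play before choosing a threading mode. The real HdfsUtilities.isThreadSafe() is not reproduced here; this is only a hypothetical stand-in illustrating the idea:

```java
public class ThreadSafetySketch {

    // hypothetical stand-in for HdfsUtilities.isThreadSafe(): bzip2 codec
    // state cannot be shared across concurrent tomcat worker threads, so
    // bzip2 splits get funneled into a single thread
    static boolean isThreadSafe(String codecClassName) {
        return !"org.apache.hadoop.io.compress.BZip2Codec".equals(codecClassName);
    }

    public static void main(String[] args) {
        System.out.println(isThreadSafe("org.apache.hadoop.io.compress.GzipCodec"));  // true
        System.out.println(isThreadSafe("org.apache.hadoop.io.compress.BZip2Codec")); // false
    }
}
```

The point of the reviewer's note: map-reduce never hits this because each task has its own JVM, while a multi-threaded PXF service must gate on the codec explicitly.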
[GitHub] incubator-hawq pull request: HAWQ-178: Add JSON plugin support in ...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/302#discussion_r51828041 --- Diff: pxf/pxf-json/src/main/java/org/apache/hawq/pxf/plugins/json/JsonResolver.java --- @@ -0,0 +1,255 @@ +package org.apache.hawq.pxf.plugins.json; + +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Iterator; +import java.util.List; + +import org.apache.commons.logging.Log; +import org.apache.commons.logging.LogFactory; +import org.apache.hawq.pxf.api.OneField; +import org.apache.hawq.pxf.api.OneRow; +import org.apache.hawq.pxf.api.ReadResolver; +import org.apache.hawq.pxf.api.io.DataType; +import org.apache.hawq.pxf.api.utilities.ColumnDescriptor; +import org.apache.hawq.pxf.api.utilities.InputData; +import org.apache.hawq.pxf.api.utilities.Plugin; +import org.codehaus.jackson.JsonFactory; +import org.codehaus.jackson.JsonNode; +import org.codehaus.jackson.map.ObjectMapper; + +/** + * This JSON resolver for PXF will decode a given object from the {@link JsonAccessor} into a row for HAWQ. It will + * decode this data into a JsonNode and walk the tree for each column. 
It supports normal value mapping via projections + * and JSON array indexing. + */ +public class JsonResolver extends Plugin implements ReadResolver { + + private static final Log LOG = LogFactory.getLog(JsonResolver.class); + + private ArrayList oneFieldList; + private ColumnDescriptorCache[] columnDescriptorCache; + private ObjectMapper mapper; + + public JsonResolver(InputData inputData) throws Exception { + super(inputData); + oneFieldList = new ArrayList(); + mapper = new ObjectMapper(new JsonFactory()); + + // Pre-generate all column structure attributes concerning the JSON to column value resolution. + columnDescriptorCache = new ColumnDescriptorCache[inputData.getColumns()]; + for (int i = 0; i < inputData.getColumns(); ++i) { + ColumnDescriptor cd = inputData.getColumn(i); + columnDescriptorCache[i] = new ColumnDescriptorCache(cd); + } + } + + @Override + public List getFields(OneRow row) throws Exception { + oneFieldList.clear(); + + String jsonRecordAsText = row.getData().toString(); + + JsonNode root = decodeLineToJsonNode(jsonRecordAsText); + + if (root == null) { + LOG.warn("Return null-fields row due to invalid JSON:" + jsonRecordAsText); --- End diff -- another possible improvement: create a final list of null fields for the case where the root is null, and immediately return that list here. That will save us the iteration over the columns and checking for each one if the root is null. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
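The suggested improvement, a precomputed null-fields row returned immediately on malformed JSON, might look like this sketch (plain Java, OneField simplified to Object, names hypothetical):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class NullRowSketch {

    static final int NUM_COLUMNS = 3;

    // built once, as the reviewer suggests: a fixed row of nulls, one
    // entry per column (OneField simplified to Object for this sketch)
    static final List<Object> NULL_ROW =
            Collections.unmodifiableList(new ArrayList<Object>(Collections.nCopies(NUM_COLUMNS, (Object) null)));

    static List<Object> getFields(Object root) {
        if (root == null) {
            // malformed JSON: return the precomputed row immediately,
            // skipping the per-column loop entirely
            return NULL_ROW;
        }
        return new ArrayList<Object>(); // real column resolution elided
    }

    public static void main(String[] args) {
        System.out.println(getFields(null).size()); // 3
    }
}
```

Because the column set is fixed per query, the null row can be built once in the constructor and reused for every bad record.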
[GitHub] incubator-hawq pull request: HAWQ-178: Add JSON plugin support in ...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/302#discussion_r51828253 --- Diff: pxf/pxf-json/src/main/java/org/apache/hawq/pxf/plugins/json/JsonResolver.java --- @@ -0,0 +1,255 @@ +package org.apache.hawq.pxf.plugins.json; + +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Iterator; +import java.util.List; + +import org.apache.commons.logging.Log; +import org.apache.commons.logging.LogFactory; +import org.apache.hawq.pxf.api.OneField; +import org.apache.hawq.pxf.api.OneRow; +import org.apache.hawq.pxf.api.ReadResolver; +import org.apache.hawq.pxf.api.io.DataType; +import org.apache.hawq.pxf.api.utilities.ColumnDescriptor; +import org.apache.hawq.pxf.api.utilities.InputData; +import org.apache.hawq.pxf.api.utilities.Plugin; +import org.codehaus.jackson.JsonFactory; +import org.codehaus.jackson.JsonNode; +import org.codehaus.jackson.map.ObjectMapper; + +/** + * This JSON resolver for PXF will decode a given object from the {@link JsonAccessor} into a row for HAWQ. It will + * decode this data into a JsonNode and walk the tree for each column. 
It supports normal value mapping via projections + * and JSON array indexing. + */ +public class JsonResolver extends Plugin implements ReadResolver { + + private static final Log LOG = LogFactory.getLog(JsonResolver.class); + + private ArrayList oneFieldList; + private ColumnDescriptorCache[] columnDescriptorCache; + private ObjectMapper mapper; + + public JsonResolver(InputData inputData) throws Exception { + super(inputData); + oneFieldList = new ArrayList(); + mapper = new ObjectMapper(new JsonFactory()); + + // Pre-generate all column structure attributes concerning the JSON to column value resolution. + columnDescriptorCache = new ColumnDescriptorCache[inputData.getColumns()]; + for (int i = 0; i < inputData.getColumns(); ++i) { + ColumnDescriptor cd = inputData.getColumn(i); + columnDescriptorCache[i] = new ColumnDescriptorCache(cd); + } + } + + @Override + public List getFields(OneRow row) throws Exception { + oneFieldList.clear(); + + String jsonRecordAsText = row.getData().toString(); + + JsonNode root = decodeLineToJsonNode(jsonRecordAsText); + + if (root == null) { + LOG.warn("Return null-fields row due to invalid JSON:" + jsonRecordAsText); + } + + // Iterate through the column definition and fetch our JSON data + for (ColumnDescriptorCache column : columnDescriptorCache) { + + // Get the current column description + + if (root == null) { + // Return empty (e.g. null) filed in case of malformed json. + addNullField(column.getColumnType()); + } else { + + // Move down the JSON path to the final name + JsonNode node = getPriorJsonNode(root, column.getProjections()); + + // If this column is an array index, ex. "tweet.hashtags[0]" + if (column.isArrayName()) { + + // Move to the array node + node = node.get(column.getArrayNodeName()); + + // If this node is null or missing, add a null value here + if (node == null || node.isMissingNode()) { + addNullField(column.getColumnType()); +
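The array-index columns handled above (e.g. "entities.hashtags[0]") require splitting the final projection into a node name and an index. A hypothetical sketch of that parsing, separate from the actual ColumnDescriptorCache implementation:

```java
public class ArrayProjectionSketch {

    // hypothetical parser for column names like "entities.hashtags[0]":
    // returns the array index, or -1 for a plain (non-array) column
    static int arrayIndex(String columnName) {
        String last = columnName.substring(columnName.lastIndexOf('.') + 1);
        if (!last.endsWith("]")) {
            return -1;
        }
        int open = last.indexOf('[');
        return Integer.parseInt(last.substring(open + 1, last.length() - 1));
    }

    public static void main(String[] args) {
        System.out.println(arrayIndex("entities.hashtags[0]"));       // 0
        System.out.println(arrayIndex("coordinates.coordinates[1]")); // 1
        System.out.println(arrayIndex("user.screen_name"));           // -1
    }
}
```

Pre-computing this per column (as the cache does) avoids re-parsing the name for every record.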
[GitHub] incubator-hawq pull request: HAWQ-343. Core when setting enable_se...
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/328#discussion_r51771858 --- Diff: src/backend/cdb/cdbfilesystemcredential.c --- @@ -382,6 +382,17 @@ cleanup_filesystem_credentials(Portal portal) } else { + /** +* this case happened for the command "set enable_secure_filesystem to true;" +* enable_secure_filesystem is true but credentials did not allocated" --- End diff -- redundant " at the end of the sentence.
[GitHub] incubator-hawq pull request: HAWQ-369. Hcatalog as reserved name.
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/324#discussion_r51762431 --- Diff: src/backend/utils/init/postinit.c --- @@ -368,6 +368,22 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username, chardbname[NAMEDATALEN]; /* +* User is not supposed to connect to hcatalog database, +* because it's reserved for hcatalog feature integration +*/ + if (!bootstrap) + { + if (strcmp(in_dbname, HcatalogDbName) == 0) + { + ereport(ERROR, + (errcode(ERRCODE_UNDEFINED_DATABASE), + errmsg("\"%s\" database is only for system use", --- End diff -- Fair enough.
[GitHub] incubator-hawq pull request: HAWQ-369. Hcatalog as reserved name.
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/324#discussion_r51658725 --- Diff: src/backend/commands/dbcommands.c --- @@ -1533,6 +1540,15 @@ RenameDatabase(const char *oldname, const char *newname) cqContext cqc; cqContext *pcqCtx; + + /* +* Make sure "hcatalog" is not used as new name, because it's reserved for +* hcatalog feature integration +*/ + if (strcmp(newname, HcatalogDbName) == 0) + ereport(ERROR, + (errcode(ERRCODE_RESERVED_NAME), + errmsg("\"%s\" is a reserved name for hcatalog feature integration", HcatalogDbName))); --- End diff -- same
[GitHub] incubator-hawq pull request: HAWQ-369. Hcatalog as reserved name.
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/324#discussion_r51658990 --- Diff: src/backend/utils/init/postinit.c --- @@ -368,6 +368,22 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username, chardbname[NAMEDATALEN]; /* +* User is not supposed to connect to hcatalog database, +* because it's reserved for hcatalog feature integration +*/ + if (!bootstrap) + { + if (strcmp(in_dbname, HcatalogDbName) == 0) + { + ereport(ERROR, + (errcode(ERRCODE_UNDEFINED_DATABASE), + errmsg("\"%s\" database is only for system use", --- End diff -- perhaps add something like "cannot connect to database hcatalog. Database is only for system use"
[GitHub] incubator-hawq pull request: HAWQ-369. Hcatalog as reserved name.
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/324#discussion_r51658794 --- Diff: src/backend/utils/init/postinit.c --- @@ -368,6 +368,22 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username, chardbname[NAMEDATALEN]; /* +* User is not supposed to connect to hcatalog database, +* because it's reserved for hcatalog feature integration --- End diff -- same
[GitHub] incubator-hawq pull request: HAWQ-369. Hcatalog as reserved name.
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/324#discussion_r51658617 --- Diff: src/backend/commands/dbcommands.c --- @@ -848,11 +848,18 @@ createdb(CreatedbStmt *stmt) * Check for db name conflict. This is just to give a more friendly error * message than "unique index violation". There's a race condition but * we're willing to accept the less friendly message in that case. +* Also check that user is not trying to use "hcatalog" as a database name, +* because it's already reserved for hcatalog feature integration. */ if (OidIsValid(get_database_oid(dbname))) - ereport(ERROR, - (errcode(ERRCODE_DUPLICATE_DATABASE), -errmsg("database \"%s\" already exists", dbname))); + if (strcmp(dbname, HcatalogDbName) == 0) + ereport(ERROR, + (errcode(ERRCODE_RESERVED_NAME), +errmsg("\"%s\" is a reserved name for hcatalog feature integration", HcatalogDbName))); --- End diff -- same comment about the message :)
[GitHub] incubator-hawq pull request: HAWQ-369. Hcatalog as reserved name.
Github user hornn commented on the pull request: https://github.com/apache/incubator-hawq/pull/324#issuecomment-178903189 Amazing job! The only standing comment is about the error messages - we should decide if it's going to be "HCatalog integration feature" or "HCatalog feature integration" +100
[GitHub] incubator-hawq pull request: HAWQ-369. Hcatalog as reserved name.
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/324#discussion_r51659023 --- Diff: src/backend/utils/init/postinit.c --- @@ -368,6 +368,22 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username, chardbname[NAMEDATALEN]; /* +* User is not supposed to connect to hcatalog database, +* because it's reserved for hcatalog feature integration +*/ + if (!bootstrap) + { + if (strcmp(in_dbname, HcatalogDbName) == 0) + { + ereport(ERROR, + (errcode(ERRCODE_UNDEFINED_DATABASE), --- End diff -- instead of undefined database, maybe use the reserved name error code?
[GitHub] incubator-hawq pull request: HAWQ-369. Hcatalog as reserved name.
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/324#discussion_r51659823 --- Diff: src/test/regress/input/hcatalog_lookup.source --- @@ -138,16 +138,49 @@ alter table test_schema.p exchange partition p1 with table hcatalog.test_schema. select pg_catalog.pg_database_size('hcatalog'); select pg_catalog.pg_database_size(6120); +--positive test: should be able to create table named "hcatalog" +CREATE TABLE hcatalog(a int); + +--negative test: cannot create database named "hcatalog" +CREATE DATABASE hcatalog; + +--allow renaming schemas and databases +SET gp_called_by_pgdump = true; + +--negative test: cannot rename existing database to "hcatalog" +ALTER DATABASE regression RENAME TO hcatalog; + +--positive test: can rename existing schema to "hcatalog" +CREATE SCHEMA test_schema3; +ALTER SCHEMA test_schema3 RENAME to hcatalog; + +--positive test: can rename schema "hcatalog" +ALTER SCHEMA hcatalog RENAME to hcatalog1; + +--positive test: should be able to create schema named "hcatalog" +CREATE SCHEMA hcatalog; + +--negative test: cannot create a database using "hcatalog" as a template +CREATE DATABASE hcatalog2 TEMPLATE hcatalog; + +--restrict renaming schemas and databases +SET gp_called_by_pgdump = false; + -- cleanup DROP schema test_schema cascade; SELECT convert_to_internal_schema('test_schema'); DROP schema test_schema cascade; DROP schema test_schema2 cascade; +DROP schema hcatalog1 cascade; --- End diff -- missing drop for hcatalog schema
[GitHub] incubator-hawq pull request: HAWQ-178: Add JSON plugin support in ...
Github user hornn commented on the pull request: https://github.com/apache/incubator-hawq/pull/302#issuecomment-178685025 I agree. We can afford to wait a little longer for a better plugin.
[GitHub] incubator-hawq pull request: HAWQ-369. Hcatalog as reserved name.
Github user hornn commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/324#discussion_r51658480 --- Diff: doc/src/sgml/ref/create_database.sgml --- @@ -184,6 +184,10 @@ CREATE DATABASE name connection slot remains for the database, it is possible that both will fail. Also, the limit is not enforced against superusers. + + + User can not create database named "hcatalog", because it's reserved for Hcatalog feature integration. --- End diff -- change to "HCatalog integration feature" to match comments above.