[jira] [Commented] (DRILL-7343) Add User-Agent UDFs to Drill
[ https://issues.apache.org/jira/browse/DRILL-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16925038#comment-16925038 ] ASF GitHub Bot commented on DRILL-7343: --- gparai commented on pull request #1840: DRILL-7343: Add User-Agent UDFs to Drill URL: https://github.com/apache/drill/pull/1840 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add User-Agent UDFs to Drill > > > Key: DRILL-7343 > URL: https://issues.apache.org/jira/browse/DRILL-7343 > Project: Apache Drill > Issue Type: New Feature >Affects Versions: 1.17.0 >Reporter: Charles Givre >Assignee: Charles Givre >Priority: Major > Labels: doc-impacting, ready-to-commit > Fix For: 1.17.0 > > > This collection of UDFs adds the ability to parse user agent strings which is > useful for security data analysis. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (DRILL-7343) Add User-Agent UDFs to Drill
[ https://issues.apache.org/jira/browse/DRILL-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923482#comment-16923482 ]

ASF GitHub Bot commented on DRILL-7343:
---
arina-ielchiieva commented on issue #1840: DRILL-7343: Add User-Agent UDFs to Drill
URL: https://github.com/apache/drill/pull/1840#issuecomment-528391368

Looks good, +1
[jira] [Commented] (DRILL-7343) Add User-Agent UDFs to Drill
[ https://issues.apache.org/jira/browse/DRILL-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923477#comment-16923477 ]

ASF GitHub Bot commented on DRILL-7343:
---
arina-ielchiieva commented on pull request #1840: DRILL-7343: Add User-Agent UDFs to Drill
URL: https://github.com/apache/drill/pull/1840#discussion_r321296574

## File path: contrib/udfs/README.md ##
@@ -0,0 +1,56 @@
+# Drill User Defined Functions
+
+This `README` documents functions which users have submitted to Apache Drill.
+
+## User Agent Functions
+Drill UDF for parsing User Agent Strings.
+This function is based on Niels Basjes Java library for parsing user agent strings which is available here: https://github.com/nielsbasjes/yauaa.
+
+### Usage
+Using this function is fairly simple. The function `parse_user_agent()` takes a user agent string as an argument and returns a map of the available fields. Note that not every field will be present in every user agent string.
+```
+SELECT parse_user_agent( columns[0] ) as ua
+FROM dfs.`/Users/cgivre/drill-httpd/ua.csv`;
+```
+The query above returns:
+```
+{
+  "DeviceClass":"Desktop",
+  "DeviceName":"Macintosh",
+  "DeviceBrand":"Apple",
+  "OperatingSystemClass":"Desktop",
+  "OperatingSystemName":"Mac OS X",
+  "OperatingSystemVersion":"10.10.1",
+  "OperatingSystemNameVersion":"Mac OS X 10.10.1",
+  "LayoutEngineClass":"Browser",
+  "LayoutEngineName":"Blink",
+  "LayoutEngineVersion":"39.0",
+  "LayoutEngineVersionMajor":"39",
+  "LayoutEngineNameVersion":"Blink 39.0",
+  "LayoutEngineNameVersionMajor":"Blink 39",
+  "AgentClass":"Browser",
+  "AgentName":"Chrome",
+  "AgentVersion":"39.0.2171.99",
+  "AgentVersionMajor":"39",
+  "AgentNameVersion":"Chrome 39.0.2171.99",
+  "AgentNameVersionMajor":"Chrome 39",
+  "DeviceCpu":"Intel"
+}
+```
+The function returns a Drill map, so you can access any of the fields using Drill's table.map.key notation. For example, the query below illustrates how to extract a field from this map and summarize it:
+
+```
+SELECT uadata.ua.AgentNameVersion AS Browser,
+COUNT( * ) AS BrowserCount
+FROM (
+  SELECT parse_user_agent( columns[0] ) AS ua
+  FROM dfs.drillworkshop.`user-agents.csv`
+) AS uadata
+GROUP BY uadata.ua.AgentNameVersion
+ORDER BY BrowserCount DESC
+```
+The function can also be called with an optional field as an argument. IE:

Review comment: @cgivre please fix this one, this is the last one :)
[jira] [Commented] (DRILL-7343) Add User-Agent UDFs to Drill
[ https://issues.apache.org/jira/browse/DRILL-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923420#comment-16923420 ]

ASF GitHub Bot commented on DRILL-7343:
---
cgivre commented on pull request #1840: DRILL-7343: Add User-Agent UDFs to Drill
URL: https://github.com/apache/drill/pull/1840#discussion_r321247492

## File path: contrib/udfs/src/main/java/org/apache/drill/exec/udfs/UserAgentFunctions.java ##
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.udfs;
+
+import io.netty.buffer.DrillBuf;
+import org.apache.drill.exec.expr.DrillSimpleFunc;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate;
+import org.apache.drill.exec.expr.annotations.Output;
+import org.apache.drill.exec.expr.annotations.Param;
+import org.apache.drill.exec.expr.annotations.Workspace;
+import org.apache.drill.exec.expr.holders.NullableVarCharHolder;
+import org.apache.drill.exec.expr.holders.VarCharHolder;
+import org.apache.drill.exec.vector.complex.writer.BaseWriter;
+
+import javax.inject.Inject;
+
+public class UserAgentFunctions {
+
+  @FunctionTemplate(name = "parse_user_agent",
+    scope = FunctionTemplate.FunctionScope.SIMPLE
+  )
+  public static class UserAgentFunction implements DrillSimpleFunc {
+    @Param
+    VarCharHolder input;
+
+    @Output
+    BaseWriter.ComplexWriter outWriter;
+
+    @Inject
+    DrillBuf outBuffer;
+
+    @Workspace
+    nl.basjes.parse.useragent.UserAgentAnalyzerDirect uaa;
+
+    public void setup() {
+      uaa = nl.basjes.parse.useragent.UserAgentAnalyzerDirect.newBuilder().dropTests().hideMatcherLoadStats().build();
+      uaa.getAllPossibleFieldNamesSorted();
+    }
+
+    public void eval() {
+      org.apache.drill.exec.vector.complex.writer.BaseWriter.MapWriter queryMapWriter = outWriter.rootAsMap();
+
+      String userAgentString = org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.getStringFromVarCharHolder(input);
+
+      if (userAgentString.isEmpty() || userAgentString.equals("null")) {

Review comment: Fixed
[jira] [Commented] (DRILL-7343) Add User-Agent UDFs to Drill
[ https://issues.apache.org/jira/browse/DRILL-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923421#comment-16923421 ]

ASF GitHub Bot commented on DRILL-7343:
---
cgivre commented on pull request #1840: DRILL-7343: Add User-Agent UDFs to Drill
URL: https://github.com/apache/drill/pull/1840#discussion_r321247553

## File path: contrib/udfs/README.md ##
@@ -0,0 +1,58 @@
+# Drill User Defined Functions
+
+This `README` documents functions which users have submitted to Apache Drill.
+
+## User Agent Functions
+Drill UDF for parsing User Agent Strings.
+This function is based on Niels Basjes Java library for parsing user agent strings which is available here: https://github.com/nielsbasjes/yauaa.
+
+### Usage
+The function `parse_user_agent()` takes a user agent string as an argument and returns a map of the available fields. Note that not every field will be present in every user agent string.
+```
+SELECT parse_user_agent( columns[0] ) as ua
+FROM dfs.`/tmp/data/drill-httpd/ua.csv`;
+```
+The query above returns:
+```
+{
+  "DeviceClass":"Desktop",
+  "DeviceName":"Macintosh",
+  "DeviceBrand":"Apple",
+  "OperatingSystemClass":"Desktop",
+  "OperatingSystemName":"Mac OS X",
+  "OperatingSystemVersion":"10.10.1",
+  "OperatingSystemNameVersion":"Mac OS X 10.10.1",
+  "LayoutEngineClass":"Browser",
+  "LayoutEngineName":"Blink",
+  "LayoutEngineVersion":"39.0",
+  "LayoutEngineVersionMajor":"39",
+  "LayoutEngineNameVersion":"Blink 39.0",
+  "LayoutEngineNameVersionMajor":"Blink 39",
+  "AgentClass":"Browser",
+  "AgentName":"Chrome",
+  "AgentVersion":"39.0.2171.99",
+  "AgentVersionMajor":"39",
+  "AgentNameVersion":"Chrome 39.0.2171.99",
+  "AgentNameVersionMajor":"Chrome 39",
+  "DeviceCpu":"Intel"
+}
+```
+The function returns a Drill map, so you can access any of the fields using Drill's table.map.key notation. For example, the query below illustrates how to extract a field from this map and summarize it:
+
+```
+SELECT uadata.ua.AgentNameVersion AS Browser,
+COUNT( * ) AS BrowserCount
+FROM (
+  SELECT parse_user_agent( columns[0] ) AS ua
+  FROM dfs.drillworkshop.`user-agents.csv`
+) AS uadata
+GROUP BY uadata.ua.AgentNameVersion
+ORDER BY BrowserCount DESC
+```
+The function can also be called with an optional field as an argument. IE:
+```
+SELECT parse_user_agent( `user_agent`, 'AgentName' ) as AgentName ...
+```
+which will just return the requested field. If the user agent string is empty, all fields will have the value of `Hacker`.
+
+Note: This function does not accept `NULL` as input.

Review comment: Removed and Fixed
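The aggregation query quoted from the README in the message above groups rows by one key of the map returned by `parse_user_agent()` and orders the groups by count. The same idea can be sketched outside Drill with plain Java collections. This is only an illustration under stated assumptions: the `BrowserCounts` class, its `countBy` helper, and the sample maps are hypothetical stand-ins for the parsed maps, and the code calls neither Drill nor yauaa. As a simplification, maps that lack the requested field are skipped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Drill or of the pull request under review.
public class BrowserCounts {

  // Counts how often each value of `field` occurs across the parsed maps and
  // returns the counts ordered from most to least frequent.
  // Maps without the field are skipped (a simplification for this sketch).
  public static Map<String, Long> countBy(List<Map<String, String>> parsed, String field) {
    Map<String, Long> counts = new LinkedHashMap<>();
    for (Map<String, String> ua : parsed) {
      String value = ua.get(field);
      if (value != null) {
        counts.merge(value, 1L, Long::sum);
      }
    }
    // Sort the entries by descending count; List.sort is stable, so ties keep first-seen order.
    List<Map.Entry<String, Long>> entries = new ArrayList<>(counts.entrySet());
    entries.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));
    Map<String, Long> ordered = new LinkedHashMap<>();
    for (Map.Entry<String, Long> entry : entries) {
      ordered.put(entry.getKey(), entry.getValue());
    }
    return ordered;
  }

  public static void main(String[] args) {
    // Sample data standing in for maps produced by parse_user_agent().
    List<Map<String, String>> parsed = Arrays.asList(
        Collections.singletonMap("AgentNameVersion", "Firefox 2.0.0.11"),
        Collections.singletonMap("AgentNameVersion", "Chrome 39.0.2171.99"),
        Collections.singletonMap("AgentNameVersion", "Chrome 39.0.2171.99"),
        Collections.singletonMap("DeviceClass", "Desktop"));
    // prints {Chrome 39.0.2171.99=2, Firefox 2.0.0.11=1}
    System.out.println(countBy(parsed, "AgentNameVersion"));
  }
}
```

In Drill itself the grouping is done by the SQL engine over the map column; the sketch only mirrors the shape of the result.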
[jira] [Commented] (DRILL-7343) Add User-Agent UDFs to Drill
[ https://issues.apache.org/jira/browse/DRILL-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923422#comment-16923422 ]

ASF GitHub Bot commented on DRILL-7343:
---
cgivre commented on pull request #1840: DRILL-7343: Add User-Agent UDFs to Drill
URL: https://github.com/apache/drill/pull/1840#discussion_r321247660

## File path: contrib/udfs/src/test/java/org/apache/drill/exec/udfs/TestUserAgentFunctions.java ##
@@ -0,0 +1,170 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.udfs;
+
+import org.apache.drill.categories.SqlFunctionTest;
+import org.apache.drill.categories.UnlikelyTest;
+import org.apache.drill.test.ClusterFixture;
+import org.apache.drill.test.ClusterFixtureBuilder;
+import org.apache.drill.test.ClusterTest;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+
+import java.util.HashMap;
+
+@Category({UnlikelyTest.class, SqlFunctionTest.class})
+public class TestUserAgentFunctions extends ClusterTest {
+
+  @BeforeClass
+  public static void setup() throws Exception {
+    ClusterFixtureBuilder builder = ClusterFixture.builder(dirTestWatcher);
+    startCluster(builder);
+  }
+
+  @Test
+  public void testParseUserAgentString() throws Exception {
+    String query = "SELECT t1.ua.DeviceClass AS DeviceClass,\n" +
+      "t1.ua.DeviceName AS DeviceName,\n" +
+      "t1.ua.DeviceBrand AS DeviceBrand,\n" +
+      "t1.ua.DeviceCpuBits AS DeviceCpuBits,\n" +
+      "t1.ua.OperatingSystemClass AS OperatingSystemClass,\n" +
+      "t1.ua.OperatingSystemName AS OperatingSystemName,\n" +
+      "t1.ua.OperatingSystemVersion AS OperatingSystemVersion,\n" +
+      "t1.ua.OperatingSystemVersionMajor AS OperatingSystemVersionMajor,\n" +
+      "t1.ua.OperatingSystemNameVersion AS OperatingSystemNameVersion,\n" +
+      "t1.ua.OperatingSystemNameVersionMajor AS OperatingSystemNameVersionMajor,\n" +
+      "t1.ua.LayoutEngineClass AS LayoutEngineClass,\n" +
+      "t1.ua.LayoutEngineName AS LayoutEngineName,\n" +
+      "t1.ua.LayoutEngineVersion AS LayoutEngineVersion,\n" +
+      "t1.ua.LayoutEngineVersionMajor AS LayoutEngineVersionMajor,\n" +
+      "t1.ua.LayoutEngineNameVersion AS LayoutEngineNameVersion,\n" +
+      "t1.ua.LayoutEngineBuild AS LayoutEngineBuild,\n" +
+      "t1.ua.AgentClass AS AgentClass,\n" +
+      "t1.ua.AgentName AS AgentName,\n" +
+      "t1.ua.AgentVersion AS AgentVersion,\n" +
+      "t1.ua.AgentVersionMajor AS AgentVersionMajor,\n" +
+      "t1.ua.AgentNameVersionMajor AS AgentNameVersionMajor,\n" +
+      "t1.ua.AgentLanguage AS AgentLanguage,\n" +
+      "t1.ua.AgentLanguageCode AS AgentLanguageCode,\n" +
+      "t1.ua.AgentSecurity AS AgentSecurity\n" +
+      "FROM (SELECT parse_user_agent('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11') AS ua FROM (values(1))) AS t1";
+
+    testBuilder()
+      .sqlQuery(query)
+      .unOrdered()
+      .baselineColumns("DeviceClass", "DeviceName", "DeviceBrand", "DeviceCpuBits", "OperatingSystemClass", "OperatingSystemName", "OperatingSystemVersion", "OperatingSystemVersionMajor", "OperatingSystemNameVersion", "OperatingSystemNameVersionMajor", "LayoutEngineClass", "LayoutEngineName", "LayoutEngineVersion", "LayoutEngineVersionMajor", "LayoutEngineNameVersion", "LayoutEngineBuild", "AgentClass", "AgentName", "AgentVersion", "AgentVersionMajor", "AgentNameVersionMajor", "AgentLanguage", "AgentLanguageCode", "AgentSecurity")
+      .baselineValues("Desktop", "Desktop", "Unknown", "32", "Desktop", "Windows NT", "XP", "XP", "Windows XP", "Windows XP", "Browser", "Gecko", "1.8.1.11", "1", "Gecko 1.8.1.11", "20071127", "Browser", "Firefox", "2.0.0.11", "2", "Firefox 2", "English (United States)", "en-us", "Strong security")
+      .go();
+  }
+
+  @Test
+  public void testGetHostName() throws Exception {
+    String query = "SELECT parse_user_agent('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11', 'AgentSecurity') AS agent FROM " +
+      "(values(1))";
+    testBuilder()
+
[jira] [Commented] (DRILL-7343) Add User-Agent UDFs to Drill
[ https://issues.apache.org/jira/browse/DRILL-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923387#comment-16923387 ]

ASF GitHub Bot commented on DRILL-7343:
---
KazydubB commented on pull request #1840: DRILL-7343: Add User-Agent UDFs to Drill
URL: https://github.com/apache/drill/pull/1840#discussion_r321233721

## File path: contrib/udfs/src/main/java/org/apache/drill/exec/udfs/UserAgentFunctions.java ##
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.udfs;
+
+import io.netty.buffer.DrillBuf;
+import org.apache.drill.exec.expr.DrillSimpleFunc;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate;
+import org.apache.drill.exec.expr.annotations.Output;
+import org.apache.drill.exec.expr.annotations.Param;
+import org.apache.drill.exec.expr.annotations.Workspace;
+import org.apache.drill.exec.expr.holders.NullableVarCharHolder;
+import org.apache.drill.exec.expr.holders.VarCharHolder;
+import org.apache.drill.exec.vector.complex.writer.BaseWriter;
+
+import javax.inject.Inject;
+
+public class UserAgentFunctions {
+
+  @FunctionTemplate(name = "parse_user_agent",
+    scope = FunctionTemplate.FunctionScope.SIMPLE
+  )
+  public static class UserAgentFunction implements DrillSimpleFunc {
+    @Param
+    VarCharHolder input;
+
+    @Output
+    BaseWriter.ComplexWriter outWriter;
+
+    @Inject
+    DrillBuf outBuffer;
+
+    @Workspace
+    nl.basjes.parse.useragent.UserAgentAnalyzerDirect uaa;
+
+    public void setup() {
+      uaa = nl.basjes.parse.useragent.UserAgentAnalyzerDirect.newBuilder().dropTests().hideMatcherLoadStats().build();
+      uaa.getAllPossibleFieldNamesSorted();
+    }
+
+    public void eval() {
+      org.apache.drill.exec.vector.complex.writer.BaseWriter.MapWriter queryMapWriter = outWriter.rootAsMap();
+
+      String userAgentString = org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.getStringFromVarCharHolder(input);
+
+      if (userAgentString.isEmpty() || userAgentString.equals("null")) {

Review comment: The `userAgentString.isEmpty()` check can be dropped (here and below).
[jira] [Commented] (DRILL-7343) Add User-Agent UDFs to Drill
[ https://issues.apache.org/jira/browse/DRILL-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923386#comment-16923386 ]

ASF GitHub Bot commented on DRILL-7343:
---
KazydubB commented on pull request #1840: DRILL-7343: Add User-Agent UDFs to Drill
URL: https://github.com/apache/drill/pull/1840#discussion_r321231642

## File path: contrib/udfs/README.md ##
@@ -0,0 +1,58 @@
+# Drill User Defined Functions
+
+This `README` documents functions which users have submitted to Apache Drill.
+
+## User Agent Functions
+Drill UDF for parsing User Agent Strings.
+This function is based on Niels Basjes Java library for parsing user agent strings which is available here: https://github.com/nielsbasjes/yauaa.
+
+### Usage
+The function `parse_user_agent()` takes a user agent string as an argument and returns a map of the available fields. Note that not every field will be present in every user agent string.
+```
+SELECT parse_user_agent( columns[0] ) as ua
+FROM dfs.`/tmp/data/drill-httpd/ua.csv`;
+```
+The query above returns:
+```
+{
+  "DeviceClass":"Desktop",
+  "DeviceName":"Macintosh",
+  "DeviceBrand":"Apple",
+  "OperatingSystemClass":"Desktop",
+  "OperatingSystemName":"Mac OS X",
+  "OperatingSystemVersion":"10.10.1",
+  "OperatingSystemNameVersion":"Mac OS X 10.10.1",
+  "LayoutEngineClass":"Browser",
+  "LayoutEngineName":"Blink",
+  "LayoutEngineVersion":"39.0",
+  "LayoutEngineVersionMajor":"39",
+  "LayoutEngineNameVersion":"Blink 39.0",
+  "LayoutEngineNameVersionMajor":"Blink 39",
+  "AgentClass":"Browser",
+  "AgentName":"Chrome",
+  "AgentVersion":"39.0.2171.99",
+  "AgentVersionMajor":"39",
+  "AgentNameVersion":"Chrome 39.0.2171.99",
+  "AgentNameVersionMajor":"Chrome 39",
+  "DeviceCpu":"Intel"
+}
+```
+The function returns a Drill map, so you can access any of the fields using Drill's table.map.key notation. For example, the query below illustrates how to extract a field from this map and summarize it:
+
+```
+SELECT uadata.ua.AgentNameVersion AS Browser,
+COUNT( * ) AS BrowserCount
+FROM (
+  SELECT parse_user_agent( columns[0] ) AS ua
+  FROM dfs.drillworkshop.`user-agents.csv`
+) AS uadata
+GROUP BY uadata.ua.AgentNameVersion
+ORDER BY BrowserCount DESC
+```
+The function can also be called with an optional field as an argument. IE:
+```
+SELECT parse_user_agent( `user_agent`, 'AgentName' ) as AgentName ...
+```
+which will just return the requested field. If the user agent string is empty, all fields will have the value of `Hacker`.
+
+Note: This function does not accept `NULL` as input.

Review comment: I believe this line may be removed? :)
[jira] [Commented] (DRILL-7343) Add User-Agent UDFs to Drill
[ https://issues.apache.org/jira/browse/DRILL-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923369#comment-16923369 ]

ASF GitHub Bot commented on DRILL-7343:
---
arina-ielchiieva commented on pull request #1840: DRILL-7343: Add User-Agent UDFs to Drill
URL: https://github.com/apache/drill/pull/1840#discussion_r321228335

## File path: contrib/udfs/src/test/java/org/apache/drill/exec/udfs/TestUserAgentFunctions.java ##
@@ -0,0 +1,170 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.udfs;
+
+import org.apache.drill.categories.SqlFunctionTest;
+import org.apache.drill.categories.UnlikelyTest;
+import org.apache.drill.test.ClusterFixture;
+import org.apache.drill.test.ClusterFixtureBuilder;
+import org.apache.drill.test.ClusterTest;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+
+import java.util.HashMap;
+
+@Category({UnlikelyTest.class, SqlFunctionTest.class})
+public class TestUserAgentFunctions extends ClusterTest {
+
+  @BeforeClass
+  public static void setup() throws Exception {
+    ClusterFixtureBuilder builder = ClusterFixture.builder(dirTestWatcher);
+    startCluster(builder);
+  }
+
+  @Test
+  public void testParseUserAgentString() throws Exception {
+    String query = "SELECT t1.ua.DeviceClass AS DeviceClass,\n" +
+      "t1.ua.DeviceName AS DeviceName,\n" +
+      "t1.ua.DeviceBrand AS DeviceBrand,\n" +
+      "t1.ua.DeviceCpuBits AS DeviceCpuBits,\n" +
+      "t1.ua.OperatingSystemClass AS OperatingSystemClass,\n" +
+      "t1.ua.OperatingSystemName AS OperatingSystemName,\n" +
+      "t1.ua.OperatingSystemVersion AS OperatingSystemVersion,\n" +
+      "t1.ua.OperatingSystemVersionMajor AS OperatingSystemVersionMajor,\n" +
+      "t1.ua.OperatingSystemNameVersion AS OperatingSystemNameVersion,\n" +
+      "t1.ua.OperatingSystemNameVersionMajor AS OperatingSystemNameVersionMajor,\n" +
+      "t1.ua.LayoutEngineClass AS LayoutEngineClass,\n" +
+      "t1.ua.LayoutEngineName AS LayoutEngineName,\n" +
+      "t1.ua.LayoutEngineVersion AS LayoutEngineVersion,\n" +
+      "t1.ua.LayoutEngineVersionMajor AS LayoutEngineVersionMajor,\n" +
+      "t1.ua.LayoutEngineNameVersion AS LayoutEngineNameVersion,\n" +
+      "t1.ua.LayoutEngineBuild AS LayoutEngineBuild,\n" +
+      "t1.ua.AgentClass AS AgentClass,\n" +
+      "t1.ua.AgentName AS AgentName,\n" +
+      "t1.ua.AgentVersion AS AgentVersion,\n" +
+      "t1.ua.AgentVersionMajor AS AgentVersionMajor,\n" +
+      "t1.ua.AgentNameVersionMajor AS AgentNameVersionMajor,\n" +
+      "t1.ua.AgentLanguage AS AgentLanguage,\n" +
+      "t1.ua.AgentLanguageCode AS AgentLanguageCode,\n" +
+      "t1.ua.AgentSecurity AS AgentSecurity\n" +
+      "FROM (SELECT parse_user_agent('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11') AS ua FROM (values(1))) AS t1";
+
+    testBuilder()
+      .sqlQuery(query)
+      .unOrdered()
+      .baselineColumns("DeviceClass", "DeviceName", "DeviceBrand", "DeviceCpuBits", "OperatingSystemClass", "OperatingSystemName", "OperatingSystemVersion", "OperatingSystemVersionMajor", "OperatingSystemNameVersion", "OperatingSystemNameVersionMajor", "LayoutEngineClass", "LayoutEngineName", "LayoutEngineVersion", "LayoutEngineVersionMajor", "LayoutEngineNameVersion", "LayoutEngineBuild", "AgentClass", "AgentName", "AgentVersion", "AgentVersionMajor", "AgentNameVersionMajor", "AgentLanguage", "AgentLanguageCode", "AgentSecurity")
+      .baselineValues("Desktop", "Desktop", "Unknown", "32", "Desktop", "Windows NT", "XP", "XP", "Windows XP", "Windows XP", "Browser", "Gecko", "1.8.1.11", "1", "Gecko 1.8.1.11", "20071127", "Browser", "Firefox", "2.0.0.11", "2", "Firefox 2", "English (United States)", "en-us", "Strong security")
+      .go();
+  }
+
+  @Test
+  public void testGetHostName() throws Exception {
+    String query = "SELECT parse_user_agent('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11', 'AgentSecurity') AS agent FROM " +
+      "(values(1))";
+    testBuilder()
+
[jira] [Commented] (DRILL-7343) Add User-Agent UDFs to Drill
[ https://issues.apache.org/jira/browse/DRILL-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923367#comment-16923367 ]

ASF GitHub Bot commented on DRILL-7343:
---
arina-ielchiieva commented on pull request #1840: DRILL-7343: Add User-Agent UDFs to Drill
URL: https://github.com/apache/drill/pull/1840#discussion_r321228229

## File path: contrib/udfs/src/test/java/org/apache/drill/exec/udfs/TestUserAgentFunctions.java ##
@@ -0,0 +1,170 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.udfs;
+
+import org.apache.drill.categories.SqlFunctionTest;
+import org.apache.drill.categories.UnlikelyTest;
+import org.apache.drill.test.ClusterFixture;
+import org.apache.drill.test.ClusterFixtureBuilder;
+import org.apache.drill.test.ClusterTest;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+
+import java.util.HashMap;
+
+@Category({UnlikelyTest.class, SqlFunctionTest.class})
+public class TestUserAgentFunctions extends ClusterTest {
+
+  @BeforeClass
+  public static void setup() throws Exception {
+    ClusterFixtureBuilder builder = ClusterFixture.builder(dirTestWatcher);
+    startCluster(builder);
+  }
+
+  @Test
+  public void testParseUserAgentString() throws Exception {
+    String query = "SELECT t1.ua.DeviceClass AS DeviceClass,\n" +
+      "t1.ua.DeviceName AS DeviceName,\n" +
+      "t1.ua.DeviceBrand AS DeviceBrand,\n" +
+      "t1.ua.DeviceCpuBits AS DeviceCpuBits,\n" +
+      "t1.ua.OperatingSystemClass AS OperatingSystemClass,\n" +
+      "t1.ua.OperatingSystemName AS OperatingSystemName,\n" +
+      "t1.ua.OperatingSystemVersion AS OperatingSystemVersion,\n" +
+      "t1.ua.OperatingSystemVersionMajor AS OperatingSystemVersionMajor,\n" +
+      "t1.ua.OperatingSystemNameVersion AS OperatingSystemNameVersion,\n" +
+      "t1.ua.OperatingSystemNameVersionMajor AS OperatingSystemNameVersionMajor,\n" +
+      "t1.ua.LayoutEngineClass AS LayoutEngineClass,\n" +
+      "t1.ua.LayoutEngineName AS LayoutEngineName,\n" +
+      "t1.ua.LayoutEngineVersion AS LayoutEngineVersion,\n" +
+      "t1.ua.LayoutEngineVersionMajor AS LayoutEngineVersionMajor,\n" +
+      "t1.ua.LayoutEngineNameVersion AS LayoutEngineNameVersion,\n" +
+      "t1.ua.LayoutEngineBuild AS LayoutEngineBuild,\n" +
+      "t1.ua.AgentClass AS AgentClass,\n" +
+      "t1.ua.AgentName AS AgentName,\n" +
+      "t1.ua.AgentVersion AS AgentVersion,\n" +
+      "t1.ua.AgentVersionMajor AS AgentVersionMajor,\n" +
+      "t1.ua.AgentNameVersionMajor AS AgentNameVersionMajor,\n" +
+      "t1.ua.AgentLanguage AS AgentLanguage,\n" +
+      "t1.ua.AgentLanguageCode AS AgentLanguageCode,\n" +
+      "t1.ua.AgentSecurity AS AgentSecurity\n" +
+      "FROM (SELECT parse_user_agent('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11') AS ua FROM (values(1))) AS t1";
+
+    testBuilder()
+      .sqlQuery(query)
+      .unOrdered()
+      .baselineColumns("DeviceClass", "DeviceName", "DeviceBrand", "DeviceCpuBits", "OperatingSystemClass", "OperatingSystemName", "OperatingSystemVersion", "OperatingSystemVersionMajor", "OperatingSystemNameVersion", "OperatingSystemNameVersionMajor", "LayoutEngineClass", "LayoutEngineName", "LayoutEngineVersion", "LayoutEngineVersionMajor", "LayoutEngineNameVersion", "LayoutEngineBuild", "AgentClass", "AgentName", "AgentVersion", "AgentVersionMajor", "AgentNameVersionMajor", "AgentLanguage", "AgentLanguageCode", "AgentSecurity")
+      .baselineValues("Desktop", "Desktop", "Unknown", "32", "Desktop", "Windows NT", "XP", "XP", "Windows XP", "Windows XP", "Browser", "Gecko", "1.8.1.11", "1", "Gecko 1.8.1.11", "20071127", "Browser", "Firefox", "2.0.0.11", "2", "Firefox 2", "English (United States)", "en-us", "Strong security")
+      .go();
+  }
+
+  @Test
+  public void testGetHostName() throws Exception {
+    String query = "SELECT parse_user_agent('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11', 'AgentSecurity') AS agent FROM " +
+      "(values(1))";
+    testBuilder()
+
[jira] [Commented] (DRILL-7343) Add User-Agent UDFs to Drill
[ https://issues.apache.org/jira/browse/DRILL-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923368#comment-16923368 ] ASF GitHub Bot commented on DRILL-7343: --- arina-ielchiieva commented on pull request #1840: DRILL-7343: Add User-Agent UDFs to Drill URL: https://github.com/apache/drill/pull/1840#discussion_r321226710 ## File path: contrib/udfs/README.md ## @@ -0,0 +1,58 @@ +# Drill User Defined Functions + +This `README` documents functions which users have submitted to Apache Drill. + +## User Agent Functions +Drill UDF for parsing User Agent Strings. +This function is based on Niels Basjes Java library for parsing user agent strings which is available here: https://github.com/nielsbasjes/yauaa. + +### Usage +The function `parse_user_agent()` takes a user agent string as an argument and returns a map of the available fields. Note that not every field will be present in every user agent string. Review comment: ```suggestion The function `parse_user_agent()` takes a user agent string as an argument and returns a map of the available fields. Note that not every field will be present in every user agent string. ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add User-Agent UDFs to Drill > > > Key: DRILL-7343 > URL: https://issues.apache.org/jira/browse/DRILL-7343 > Project: Apache Drill > Issue Type: New Feature >Affects Versions: 1.17.0 >Reporter: Charles Givre >Assignee: Charles Givre >Priority: Major > Fix For: 1.17.0 > > > This collection of UDFs adds the ability to parse user agent strings which is > useful for security data analysis. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (DRILL-7343) Add User-Agent UDFs to Drill
[ https://issues.apache.org/jira/browse/DRILL-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923366#comment-16923366 ] ASF GitHub Bot commented on DRILL-7343: --- arina-ielchiieva commented on pull request #1840: DRILL-7343: Add User-Agent UDFs to Drill URL: https://github.com/apache/drill/pull/1840#discussion_r321228185 ## File path: contrib/udfs/src/test/java/org/apache/drill/exec/udfs/TestUserAgentFunctions.java ## @@ -0,0 +1,170 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.drill.exec.udfs; + +import org.apache.drill.categories.SqlFunctionTest; +import org.apache.drill.categories.UnlikelyTest; +import org.apache.drill.test.ClusterFixture; +import org.apache.drill.test.ClusterFixtureBuilder; +import org.apache.drill.test.ClusterTest; +import org.junit.BeforeClass; +import org.junit.Test; +import org.junit.experimental.categories.Category; + +import java.util.HashMap; + +@Category({UnlikelyTest.class, SqlFunctionTest.class}) +public class TestUserAgentFunctions extends ClusterTest { + + @BeforeClass + public static void setup() throws Exception { +ClusterFixtureBuilder builder = ClusterFixture.builder(dirTestWatcher); +startCluster(builder); + } + + @Test + public void testParseUserAgentString() throws Exception { +String query = "SELECT t1.ua.DeviceClass AS DeviceClass,\n" + + "t1.ua.DeviceName AS DeviceName,\n" + + "t1.ua.DeviceBrand AS DeviceBrand,\n" + + "t1.ua.DeviceCpuBits AS DeviceCpuBits,\n" + + "t1.ua.OperatingSystemClass AS OperatingSystemClass,\n" + + "t1.ua.OperatingSystemName AS OperatingSystemName,\n" + + "t1.ua.OperatingSystemVersion AS OperatingSystemVersion,\n" + + "t1.ua.OperatingSystemVersionMajor AS OperatingSystemVersionMajor,\n" + + "t1.ua.OperatingSystemNameVersion AS OperatingSystemNameVersion,\n" + + "t1.ua.OperatingSystemNameVersionMajor AS OperatingSystemNameVersionMajor,\n" + + "t1.ua.LayoutEngineClass AS LayoutEngineClass,\n" + + "t1.ua.LayoutEngineName AS LayoutEngineName,\n" + + "t1.ua.LayoutEngineVersion AS LayoutEngineVersion,\n" + + "t1.ua.LayoutEngineVersionMajor AS LayoutEngineVersionMajor,\n" + + "t1.ua.LayoutEngineNameVersion AS LayoutEngineNameVersion,\n" + + "t1.ua.LayoutEngineBuild AS LayoutEngineBuild,\n" + + "t1.ua.AgentClass AS AgentClass,\n" + + "t1.ua.AgentName AS AgentName,\n" + + "t1.ua.AgentVersion AS AgentVersion,\n" + + "t1.ua.AgentVersionMajor AS AgentVersionMajor,\n" + + "t1.ua.AgentNameVersionMajor AS AgentNameVersionMajor,\n" + + "t1.ua.AgentLanguage AS 
AgentLanguage,\n" + + "t1.ua.AgentLanguageCode AS AgentLanguageCode,\n" + + "t1.ua.AgentSecurity AS AgentSecurity\n" + + "FROM (SELECT parse_user_agent('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11') AS ua FROM (values(1))) AS t1"; + +testBuilder() + .sqlQuery(query) + .unOrdered() + .baselineColumns("DeviceClass", "DeviceName", "DeviceBrand", "DeviceCpuBits", "OperatingSystemClass", "OperatingSystemName", "OperatingSystemVersion", "OperatingSystemVersionMajor", "OperatingSystemNameVersion", "OperatingSystemNameVersionMajor", "LayoutEngineClass", "LayoutEngineName", "LayoutEngineVersion", "LayoutEngineVersionMajor", "LayoutEngineNameVersion", "LayoutEngineBuild", "AgentClass", "AgentName", "AgentVersion", "AgentVersionMajor", "AgentNameVersionMajor", "AgentLanguage", "AgentLanguageCode", "AgentSecurity") + .baselineValues("Desktop", "Desktop", "Unknown", "32", "Desktop", "Windows NT", "XP", "XP", "Windows XP", "Windows XP", "Browser", "Gecko", "1.8.1.11", "1", "Gecko 1.8.1.11", "20071127", "Browser", "Firefox", "2.0.0.11", "2", "Firefox 2", "English (United States)", "en-us", "Strong security") + .go(); + } + + @Test + public void testGetHostName() throws Exception { +String query = "SELECT parse_user_agent('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11', 'AgentSecurity') AS agent FROM " + + "(values(1))"; +testBuilder() +
[jira] [Commented] (DRILL-7343) Add User-Agent UDFs to Drill
[ https://issues.apache.org/jira/browse/DRILL-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923370#comment-16923370 ] ASF GitHub Bot commented on DRILL-7343: --- arina-ielchiieva commented on pull request #1840: DRILL-7343: Add User-Agent UDFs to Drill URL: https://github.com/apache/drill/pull/1840#discussion_r321228379 ## File path: contrib/udfs/src/test/java/org/apache/drill/exec/udfs/TestUserAgentFunctions.java ## @@ -0,0 +1,170 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.drill.exec.udfs; + +import org.apache.drill.categories.SqlFunctionTest; +import org.apache.drill.categories.UnlikelyTest; +import org.apache.drill.test.ClusterFixture; +import org.apache.drill.test.ClusterFixtureBuilder; +import org.apache.drill.test.ClusterTest; +import org.junit.BeforeClass; +import org.junit.Test; +import org.junit.experimental.categories.Category; + +import java.util.HashMap; + +@Category({UnlikelyTest.class, SqlFunctionTest.class}) +public class TestUserAgentFunctions extends ClusterTest { + + @BeforeClass + public static void setup() throws Exception { +ClusterFixtureBuilder builder = ClusterFixture.builder(dirTestWatcher); +startCluster(builder); + } + + @Test + public void testParseUserAgentString() throws Exception { +String query = "SELECT t1.ua.DeviceClass AS DeviceClass,\n" + + "t1.ua.DeviceName AS DeviceName,\n" + + "t1.ua.DeviceBrand AS DeviceBrand,\n" + + "t1.ua.DeviceCpuBits AS DeviceCpuBits,\n" + + "t1.ua.OperatingSystemClass AS OperatingSystemClass,\n" + + "t1.ua.OperatingSystemName AS OperatingSystemName,\n" + + "t1.ua.OperatingSystemVersion AS OperatingSystemVersion,\n" + + "t1.ua.OperatingSystemVersionMajor AS OperatingSystemVersionMajor,\n" + + "t1.ua.OperatingSystemNameVersion AS OperatingSystemNameVersion,\n" + + "t1.ua.OperatingSystemNameVersionMajor AS OperatingSystemNameVersionMajor,\n" + + "t1.ua.LayoutEngineClass AS LayoutEngineClass,\n" + + "t1.ua.LayoutEngineName AS LayoutEngineName,\n" + + "t1.ua.LayoutEngineVersion AS LayoutEngineVersion,\n" + + "t1.ua.LayoutEngineVersionMajor AS LayoutEngineVersionMajor,\n" + + "t1.ua.LayoutEngineNameVersion AS LayoutEngineNameVersion,\n" + + "t1.ua.LayoutEngineBuild AS LayoutEngineBuild,\n" + + "t1.ua.AgentClass AS AgentClass,\n" + + "t1.ua.AgentName AS AgentName,\n" + + "t1.ua.AgentVersion AS AgentVersion,\n" + + "t1.ua.AgentVersionMajor AS AgentVersionMajor,\n" + + "t1.ua.AgentNameVersionMajor AS AgentNameVersionMajor,\n" + + "t1.ua.AgentLanguage AS 
AgentLanguage,\n" + + "t1.ua.AgentLanguageCode AS AgentLanguageCode,\n" + + "t1.ua.AgentSecurity AS AgentSecurity\n" + + "FROM (SELECT parse_user_agent('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11') AS ua FROM (values(1))) AS t1"; + +testBuilder() + .sqlQuery(query) + .unOrdered() + .baselineColumns("DeviceClass", "DeviceName", "DeviceBrand", "DeviceCpuBits", "OperatingSystemClass", "OperatingSystemName", "OperatingSystemVersion", "OperatingSystemVersionMajor", "OperatingSystemNameVersion", "OperatingSystemNameVersionMajor", "LayoutEngineClass", "LayoutEngineName", "LayoutEngineVersion", "LayoutEngineVersionMajor", "LayoutEngineNameVersion", "LayoutEngineBuild", "AgentClass", "AgentName", "AgentVersion", "AgentVersionMajor", "AgentNameVersionMajor", "AgentLanguage", "AgentLanguageCode", "AgentSecurity") + .baselineValues("Desktop", "Desktop", "Unknown", "32", "Desktop", "Windows NT", "XP", "XP", "Windows XP", "Windows XP", "Browser", "Gecko", "1.8.1.11", "1", "Gecko 1.8.1.11", "20071127", "Browser", "Firefox", "2.0.0.11", "2", "Firefox 2", "English (United States)", "en-us", "Strong security") + .go(); + } + + @Test + public void testGetHostName() throws Exception { +String query = "SELECT parse_user_agent('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11', 'AgentSecurity') AS agent FROM " + + "(values(1))"; +testBuilder() +
[jira] [Commented] (DRILL-7343) Add User-Agent UDFs to Drill
[ https://issues.apache.org/jira/browse/DRILL-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923365#comment-16923365 ] ASF GitHub Bot commented on DRILL-7343: --- arina-ielchiieva commented on pull request #1840: DRILL-7343: Add User-Agent UDFs to Drill URL: https://github.com/apache/drill/pull/1840#discussion_r321226651 ## File path: contrib/udfs/README.md ## @@ -0,0 +1,58 @@ +# Drill User Defined Functions + +This `README` documents functions which users have submitted to Apache Drill. + +## User Agent Functions +Drill UDF for parsing User Agent Strings. +This function is based on Niels Basjes Java library for parsing user agent strings which is available here: https://github.com/nielsbasjes/yauaa. + +### Usage +The function `parse_user_agent()` takes a user agent string as an argument and returns a map of the available fields. Note that not every field will be present in every user agent string. +``` +SELECT parse_user_agent( columns[0] ) as ua +FROM dfs.`/tmp/data/drill-httpd/ua.csv`; +``` +The query above returns: +``` +{ + "DeviceClass":"Desktop", + "DeviceName":"Macintosh", + "DeviceBrand":"Apple", + "OperatingSystemClass":"Desktop", + "OperatingSystemName":"Mac OS X", + "OperatingSystemVersion":"10.10.1", + "OperatingSystemNameVersion":"Mac OS X 10.10.1", + "LayoutEngineClass":"Browser", + "LayoutEngineName":"Blink", + "LayoutEngineVersion":"39.0", + "LayoutEngineVersionMajor":"39", + "LayoutEngineNameVersion":"Blink 39.0", + "LayoutEngineNameVersionMajor":"Blink 39", + "AgentClass":"Browser", + "AgentName":"Chrome", + "AgentVersion":"39.0.2171.99", + "AgentVersionMajor":"39", + "AgentNameVersion":"Chrome 39.0.2171.99", + "AgentNameVersionMajor":"Chrome 39", + "DeviceCpu":"Intel" +} +``` +The function returns a Drill map, so you can access any of the fields using Drill's table.map.key notation. 
For example, the query below illustrates how to extract a field from this map and summarize it: + +``` +SELECT uadata.ua.AgentNameVersion AS Browser, +COUNT( * ) AS BrowserCount +FROM ( + SELECT parse_user_agent( columns[0] ) AS ua + FROM dfs.drillworkshop.`user-agents.csv` +) AS uadata +GROUP BY uadata.ua.AgentNameVersion +ORDER BY BrowserCount DESC +``` +The function can also be called with an optional field as an argument. For example: +``` +SELECT parse_user_agent( `user_agent`, 'AgentName' ) as AgentName ... +``` +which will just return the requested field. If the user agent string is empty, all fields will have the value of `Hacker`. Review comment: ```suggestion which will just return the requested field. If the user agent string is empty, all fields will have the value of `Hacker`. ```
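The aggregation in the README excerpt quoted above (read one key out of each parsed map, then count and sort) can be sketched in plain Java. This is only an illustration of the logic, not Drill or yauaa code: the class `BrowserCountSketch`, the helpers `row`, `sampleRows` and `countByField`, and the sample rows are all hypothetical, and rows that lack the requested key are skipped here for simplicity, whereas a SQL `GROUP BY` would put them in a NULL group.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BrowserCountSketch {

  // Builds one "parsed user agent" map from alternating key/value strings (hypothetical helper).
  static Map<String, String> row(String... keyValues) {
    Map<String, String> fields = new HashMap<>();
    for (int i = 0; i + 1 < keyValues.length; i += 2) {
      fields.put(keyValues[i], keyValues[i + 1]);
    }
    return fields;
  }

  // Made-up sample data; the last row deliberately lacks AgentNameVersion,
  // because not every field is present for every user agent string.
  static List<Map<String, String>> sampleRows() {
    return Arrays.asList(
        row("AgentName", "Chrome", "AgentNameVersion", "Chrome 39.0.2171.99"),
        row("AgentName", "Firefox", "AgentNameVersion", "Firefox 2.0.0.11"),
        row("AgentName", "Chrome", "AgentNameVersion", "Chrome 39.0.2171.99"),
        row("DeviceClass", "Desktop"));
  }

  // Counts rows per value of the given field and returns them ordered by count, descending.
  static Map<String, Long> countByField(List<Map<String, String>> rows, String field) {
    Map<String, Long> counts = new HashMap<>();
    for (Map<String, String> ua : rows) {
      String value = ua.get(field);
      if (value != null) {               // simplification: skip rows without the field
        counts.merge(value, 1L, Long::sum);
      }
    }
    List<Map.Entry<String, Long>> entries = new ArrayList<>(counts.entrySet());
    entries.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));
    Map<String, Long> ordered = new LinkedHashMap<>();
    for (Map.Entry<String, Long> entry : entries) {
      ordered.put(entry.getKey(), entry.getValue());
    }
    return ordered;
  }

  public static void main(String[] args) {
    // prints {Chrome 39.0.2171.99=2, Firefox 2.0.0.11=1}
    System.out.println(countByField(sampleRows(), "AgentNameVersion"));
  }
}
```

In Drill itself the same grouping is expressed by the `GROUP BY uadata.ua.AgentNameVersion` query quoted above; the sketch only shows what that query computes.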
[jira] [Commented] (DRILL-7343) Add User-Agent UDFs to Drill
[ https://issues.apache.org/jira/browse/DRILL-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923371#comment-16923371 ] ASF GitHub Bot commented on DRILL-7343: --- arina-ielchiieva commented on pull request #1840: DRILL-7343: Add User-Agent UDFs to Drill URL: https://github.com/apache/drill/pull/1840#discussion_r321226983 ## File path: contrib/udfs/README.md ## @@ -0,0 +1,58 @@ +# Drill User Defined Functions + +This `README` documents functions which users have submitted to Apache Drill. + +## User Agent Functions +Drill UDF for parsing User Agent Strings. +This function is based on Niels Basjes Java library for parsing user agent strings which is available here: https://github.com/nielsbasjes/yauaa. Review comment: Please make a link: https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet#links
[jira] [Commented] (DRILL-7343) Add User-Agent UDFs to Drill
[ https://issues.apache.org/jira/browse/DRILL-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16922644#comment-16922644 ] ASF GitHub Bot commented on DRILL-7343: --- cgivre commented on issue #1840: DRILL-7343: Add User-Agent UDFs to Drill URL: https://github.com/apache/drill/pull/1840#issuecomment-527979825 @KazydubB I opened a JIRA for the null issues, but this PR should be ready to go.
[jira] [Commented] (DRILL-7343) Add User-Agent UDFs to Drill
[ https://issues.apache.org/jira/browse/DRILL-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16922585#comment-16922585 ] ASF GitHub Bot commented on DRILL-7343: --- KazydubB commented on issue #1840: DRILL-7343: Add User-Agent UDFs to Drill URL: https://github.com/apache/drill/pull/1840#issuecomment-527940960 @cgivre I agree that having two functions for `OPTIONAL` and `REQUIRED` cases is not the best experience, but returning an empty list on `NULL` is not enough, because there may be functions that will need to return a `MAP` for example. So, there's a need to provide `NullHandling` for `NULL` input for other cases too (`EMPTY_LIST_IF_NULL`, `EMPTY_MAP_IF_NULL` etc.).
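The point made in this comment, that one null-handling policy per complex return type would be needed, can be sketched in plain Java. This is a hedged illustration only and not Drill's function-template API: the class `NullPolicySketch`, the enum `NullPolicy` and the method `evaluate` are hypothetical, and `EMPTY_LIST_IF_NULL` and `EMPTY_MAP_IF_NULL` are the names proposed in the comment rather than existing options.

```java
import java.util.Collections;
import java.util.function.Function;

public class NullPolicySketch {

  // Hypothetical policies; the last two names are taken from the proposal in the comment.
  enum NullPolicy { NULL_IF_NULL, EMPTY_LIST_IF_NULL, EMPTY_MAP_IF_NULL }

  // Runs the function body only for non-null input; for null input it returns
  // the default value dictated by the policy, so a single function definition suffices.
  static Object evaluate(NullPolicy policy, String input, Function<String, Object> body) {
    if (input != null) {
      return body.apply(input);
    }
    switch (policy) {
      case EMPTY_LIST_IF_NULL:
        return Collections.emptyList();
      case EMPTY_MAP_IF_NULL:
        return Collections.emptyMap();
      default:
        return null;
    }
  }

  public static void main(String[] args) {
    // Stand-in for a function with map output; it just wraps the raw string.
    Function<String, Object> toMap = ua -> Collections.singletonMap("raw", ua);

    System.out.println(evaluate(NullPolicy.EMPTY_MAP_IF_NULL, null, toMap));          // prints {}
    System.out.println(evaluate(NullPolicy.EMPTY_LIST_IF_NULL, null, toMap));         // prints []
    System.out.println(evaluate(NullPolicy.EMPTY_MAP_IF_NULL, "Mozilla/5.0", toMap)); // prints {raw=Mozilla/5.0}
  }
}
```

The design choice being illustrated is that the policy, not a second overload of the function, decides what a null input produces.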
[jira] [Commented] (DRILL-7343) Add User-Agent UDFs to Drill
[ https://issues.apache.org/jira/browse/DRILL-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16922580#comment-16922580 ] ASF GitHub Bot commented on DRILL-7343: --- cgivre commented on issue #1840: DRILL-7343: Add User-Agent UDFs to Drill URL: https://github.com/apache/drill/pull/1840#issuecomment-527935677 @KazydubB What happens in the `parse_query` function is that there is a duplicate function of the same name which allows null input and returns an empty list. I had tried this before, but apparently didn't use the correct parameters so it didn't work for me, BUT, now it does so yay! In any event, this seems like horrible design. I'm going to open a JIRA to create a new function handler which returns an empty list on null. That seems like a better approach than having to write two UDFs for every UDF with a complex writer output. I also added a series of unit tests to test this functionality.
[jira] [Commented] (DRILL-7343) Add User-Agent UDFs to Drill
[ https://issues.apache.org/jira/browse/DRILL-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16922552#comment-16922552 ] ASF GitHub Bot commented on DRILL-7343: --- KazydubB commented on pull request #1840: DRILL-7343: Add User-Agent UDFs to Drill URL: https://github.com/apache/drill/pull/1840#discussion_r320787922 ## File path: contrib/udfs/src/test/java/org/apache/drill/exec/udfs/TestUserAgentFunctions.java ## @@ -0,0 +1,90 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.drill.exec.udfs; + +import org.apache.drill.categories.SqlFunctionTest; +import org.apache.drill.categories.UnlikelyTest; +import org.apache.drill.test.ClusterFixture; +import org.apache.drill.test.ClusterFixtureBuilder; +import org.apache.drill.test.ClusterTest; +import org.junit.BeforeClass; +import org.junit.Test; +import org.junit.experimental.categories.Category; + +@Category({UnlikelyTest.class, SqlFunctionTest.class}) +public class TestUserAgentFunctions extends ClusterTest { + + @BeforeClass + public static void setup() throws Exception { +ClusterFixtureBuilder builder = ClusterFixture.builder(dirTestWatcher); +startCluster(builder); + } + + @Test + public void testParseUserAgentString() throws Exception { +String query = "SELECT t1.ua.DeviceClass AS DeviceClass,\n" + + "t1.ua.DeviceName AS DeviceName,\n" + + "t1.ua.DeviceBrand AS DeviceBrand,\n" + + "t1.ua.DeviceCpuBits AS DeviceCpuBits,\n" + + "t1.ua.OperatingSystemClass AS OperatingSystemClass,\n" + + "t1.ua.OperatingSystemName AS OperatingSystemName,\n" + + "t1.ua.OperatingSystemVersion AS OperatingSystemVersion,\n" + + "t1.ua.OperatingSystemVersionMajor AS OperatingSystemVersionMajor,\n" + + "t1.ua.OperatingSystemNameVersion AS OperatingSystemNameVersion,\n" + + "t1.ua.OperatingSystemNameVersionMajor AS OperatingSystemNameVersionMajor,\n" + + "t1.ua.LayoutEngineClass AS LayoutEngineClass,\n" + + "t1.ua.LayoutEngineName AS LayoutEngineName,\n" + + "t1.ua.LayoutEngineVersion AS LayoutEngineVersion,\n" + + "t1.ua.LayoutEngineVersionMajor AS LayoutEngineVersionMajor,\n" + + "t1.ua.LayoutEngineNameVersion AS LayoutEngineNameVersion,\n" + + "t1.ua.LayoutEngineBuild AS LayoutEngineBuild,\n" + + "t1.ua.AgentClass AS AgentClass,\n" + + "t1.ua.AgentName AS AgentName,\n" + + "t1.ua.AgentVersion AS AgentVersion,\n" + + "t1.ua.AgentVersionMajor AS AgentVersionMajor,\n" + + "t1.ua.AgentNameVersionMajor AS AgentNameVersionMajor,\n" + + "t1.ua.AgentLanguage AS AgentLanguage,\n" + + 
"t1.ua.AgentLanguageCode AS AgentLanguageCode,\n" + + "t1.ua.AgentSecurity AS AgentSecurity\n" + + "FROM (SELECT parse_user_agent('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11') AS ua FROM (values(1))) AS t1"; + +testBuilder().sqlQuery(query).unOrdered() + .baselineColumns("DeviceClass", "DeviceName", "DeviceBrand","DeviceCpuBits","OperatingSystemClass", "OperatingSystemName","OperatingSystemVersion", + "OperatingSystemVersionMajor","OperatingSystemNameVersion","OperatingSystemNameVersionMajor","LayoutEngineClass","LayoutEngineName","LayoutEngineVersion", + "LayoutEngineVersionMajor","LayoutEngineNameVersion","LayoutEngineBuild","AgentClass","AgentName","AgentVersion","AgentVersionMajor","AgentNameVersionMajor", +"AgentLanguage","AgentLanguageCode","AgentSecurity") + .baselineValues("Desktop","Desktop", "Unknown","32", "Desktop", "Windows NT", "XP", "XP", "Windows XP", "Windows XP", "Browser", "Gecko", "1.8.1.11", "1", "Gecko 1.8.1.11", +"20071127", "Browser", "Firefox", "2.0.0.11", "2", "Firefox 2", "English (United States)", "en-us","Strong security") + .go(); + } + + @Test + public void testGetHostName() throws Exception { +String query = "SELECT parse_user_agent('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11', 'AgentSecurity') AS agent FROM " + + "(values" + + "(1))"; +
[jira] [Commented] (DRILL-7343) Add User-Agent UDFs to Drill
[ https://issues.apache.org/jira/browse/DRILL-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16922551#comment-16922551 ]

ASF GitHub Bot commented on DRILL-7343:
---

KazydubB commented on pull request #1840: DRILL-7343: Add User-Agent UDFs to Drill
URL: https://github.com/apache/drill/pull/1840#discussion_r320787922

## File path: contrib/udfs/src/test/java/org/apache/drill/exec/udfs/TestUserAgentFunctions.java ##
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.udfs;
+
+import org.apache.drill.categories.SqlFunctionTest;
+import org.apache.drill.categories.UnlikelyTest;
+import org.apache.drill.test.ClusterFixture;
+import org.apache.drill.test.ClusterFixtureBuilder;
+import org.apache.drill.test.ClusterTest;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+
+@Category({UnlikelyTest.class, SqlFunctionTest.class})
+public class TestUserAgentFunctions extends ClusterTest {
+
+  @BeforeClass
+  public static void setup() throws Exception {
+    ClusterFixtureBuilder builder = ClusterFixture.builder(dirTestWatcher);
+    startCluster(builder);
+  }
+
+  @Test
+  public void testParseUserAgentString() throws Exception {
+    String query = "SELECT t1.ua.DeviceClass AS DeviceClass,\n" +
+      "t1.ua.DeviceName AS DeviceName,\n" +
+      "t1.ua.DeviceBrand AS DeviceBrand,\n" +
+      "t1.ua.DeviceCpuBits AS DeviceCpuBits,\n" +
+      "t1.ua.OperatingSystemClass AS OperatingSystemClass,\n" +
+      "t1.ua.OperatingSystemName AS OperatingSystemName,\n" +
+      "t1.ua.OperatingSystemVersion AS OperatingSystemVersion,\n" +
+      "t1.ua.OperatingSystemVersionMajor AS OperatingSystemVersionMajor,\n" +
+      "t1.ua.OperatingSystemNameVersion AS OperatingSystemNameVersion,\n" +
+      "t1.ua.OperatingSystemNameVersionMajor AS OperatingSystemNameVersionMajor,\n" +
+      "t1.ua.LayoutEngineClass AS LayoutEngineClass,\n" +
+      "t1.ua.LayoutEngineName AS LayoutEngineName,\n" +
+      "t1.ua.LayoutEngineVersion AS LayoutEngineVersion,\n" +
+      "t1.ua.LayoutEngineVersionMajor AS LayoutEngineVersionMajor,\n" +
+      "t1.ua.LayoutEngineNameVersion AS LayoutEngineNameVersion,\n" +
+      "t1.ua.LayoutEngineBuild AS LayoutEngineBuild,\n" +
+      "t1.ua.AgentClass AS AgentClass,\n" +
+      "t1.ua.AgentName AS AgentName,\n" +
+      "t1.ua.AgentVersion AS AgentVersion,\n" +
+      "t1.ua.AgentVersionMajor AS AgentVersionMajor,\n" +
+      "t1.ua.AgentNameVersionMajor AS AgentNameVersionMajor,\n" +
+      "t1.ua.AgentLanguage AS AgentLanguage,\n" +
+      "t1.ua.AgentLanguageCode AS AgentLanguageCode,\n" +
+      "t1.ua.AgentSecurity AS AgentSecurity\n" +
+      "FROM (SELECT parse_user_agent('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11') AS ua FROM (values(1))) AS t1";
+
+    testBuilder().sqlQuery(query).unOrdered()
+      .baselineColumns("DeviceClass", "DeviceName", "DeviceBrand","DeviceCpuBits","OperatingSystemClass", "OperatingSystemName","OperatingSystemVersion",
+          "OperatingSystemVersionMajor","OperatingSystemNameVersion","OperatingSystemNameVersionMajor","LayoutEngineClass","LayoutEngineName","LayoutEngineVersion",
+          "LayoutEngineVersionMajor","LayoutEngineNameVersion","LayoutEngineBuild","AgentClass","AgentName","AgentVersion","AgentVersionMajor","AgentNameVersionMajor",
+        "AgentLanguage","AgentLanguageCode","AgentSecurity")
+      .baselineValues("Desktop","Desktop", "Unknown","32", "Desktop", "Windows NT", "XP", "XP", "Windows XP", "Windows XP", "Browser", "Gecko", "1.8.1.11", "1", "Gecko 1.8.1.11",
+        "20071127", "Browser", "Firefox", "2.0.0.11", "2", "Firefox 2", "English (United States)", "en-us","Strong security")
+      .go();
+  }
+
+  @Test
+  public void testGetHostName() throws Exception {
+    String query = "SELECT parse_user_agent('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11', 'AgentSecurity') AS agent FROM " +
+      "(values" +
+      "(1))";
+
[jira] [Commented] (DRILL-7343) Add User-Agent UDFs to Drill
[ https://issues.apache.org/jira/browse/DRILL-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16922541#comment-16922541 ]

ASF GitHub Bot commented on DRILL-7343:
---

cgivre commented on pull request #1840: DRILL-7343: Add User-Agent UDFs to Drill
URL: https://github.com/apache/drill/pull/1840#discussion_r320784690

## File path: contrib/udfs/src/test/java/org/apache/drill/exec/udfs/TestUserAgentFunctions.java ##
[jira] [Commented] (DRILL-7343) Add User-Agent UDFs to Drill
[ https://issues.apache.org/jira/browse/DRILL-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16922538#comment-16922538 ]

ASF GitHub Bot commented on DRILL-7343:
---

cgivre commented on pull request #1840: DRILL-7343: Add User-Agent UDFs to Drill
URL: https://github.com/apache/drill/pull/1840#discussion_r320782860

## File path: contrib/udfs/src/test/java/org/apache/drill/exec/udfs/TestUserAgentFunctions.java ##
[jira] [Commented] (DRILL-7343) Add User-Agent UDFs to Drill
[ https://issues.apache.org/jira/browse/DRILL-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16922533#comment-16922533 ]

ASF GitHub Bot commented on DRILL-7343:
---

KazydubB commented on pull request #1840: DRILL-7343: Add User-Agent UDFs to Drill
URL: https://github.com/apache/drill/pull/1840#discussion_r320781926

## File path: contrib/udfs/src/test/java/org/apache/drill/exec/udfs/TestUserAgentFunctions.java ##
[jira] [Commented] (DRILL-7343) Add User-Agent UDFs to Drill
[ https://issues.apache.org/jira/browse/DRILL-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16922491#comment-16922491 ]

ASF GitHub Bot commented on DRILL-7343:
---

cgivre commented on pull request #1840: DRILL-7343: Add User-Agent UDFs to Drill
URL: https://github.com/apache/drill/pull/1840#discussion_r320763598

## File path: contrib/udfs/src/test/java/org/apache/drill/exec/udfs/TestUserAgentFunctions.java ##
[jira] [Commented] (DRILL-7343) Add User-Agent UDFs to Drill
[ https://issues.apache.org/jira/browse/DRILL-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16917146#comment-16917146 ]

ASF GitHub Bot commented on DRILL-7343:
---

cgivre commented on pull request #1840: DRILL-7343: Add User-Agent UDFs to Drill
URL: https://github.com/apache/drill/pull/1840#discussion_r318315055

## File path: contrib/udfs/src/main/java/org/apache/drill/exec/udfs/UserAgentFunctions.java ##
@@ -0,0 +1,126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.udfs;
+
+import io.netty.buffer.DrillBuf;
+import org.apache.drill.exec.expr.DrillSimpleFunc;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate;
+import org.apache.drill.exec.expr.annotations.Output;
+import org.apache.drill.exec.expr.annotations.Param;
+import org.apache.drill.exec.expr.annotations.Workspace;
+import org.apache.drill.exec.expr.holders.VarCharHolder;
+import org.apache.drill.exec.vector.complex.writer.BaseWriter;
+
+import javax.inject.Inject;
+
+public class UserAgentFunctions {
+
+  @FunctionTemplate(name = "parse_user_agent", scope = FunctionTemplate.FunctionScope.SIMPLE)
+
+  public static class UserAgentFunction implements DrillSimpleFunc {
+    @Param
+    VarCharHolder input;
+
+    @Output
+    BaseWriter.ComplexWriter outWriter;
+
+    @Inject
+    DrillBuf outBuffer;
+
+    @Workspace
+    nl.basjes.parse.useragent.UserAgentAnalyzerDirect uaa;
+
+    public void setup() {
+      uaa = nl.basjes.parse.useragent.UserAgentAnalyzerDirect.newBuilder().dropTests().hideMatcherLoadStats().build();
+      uaa.getAllPossibleFieldNamesSorted();
+    }
+
+    public void eval() {
+      org.apache.drill.exec.vector.complex.writer.BaseWriter.MapWriter queryMapWriter = outWriter.rootAsMap();
+
+      String userAgentString = org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.getStringFromVarCharHolder(input);
+
+      if (userAgentString.isEmpty() || userAgentString.equals("null")) {
+        userAgentString = "";
+      }
+
+      nl.basjes.parse.useragent.UserAgent agent = uaa.parse(userAgentString);
+
+      for (String fieldName : agent.getAvailableFieldNamesSorted()) {
+
+        org.apache.drill.exec.expr.holders.VarCharHolder rowHolder = new org.apache.drill.exec.expr.holders.VarCharHolder();
+        String field = agent.getValue(fieldName);
+
+        byte[] rowStringBytes = field.getBytes();
+        outBuffer.reallocIfNeeded(rowStringBytes.length);
+        outBuffer.setBytes(0, rowStringBytes);
+
+        rowHolder.start = 0;
+        rowHolder.end = rowStringBytes.length;
+        rowHolder.buffer = outBuffer;
+
+        queryMapWriter.varChar(fieldName).write(rowHolder);
+      }
+    }
+  }
+
+  @FunctionTemplate(name = "parse_user_agent", scope = FunctionTemplate.FunctionScope.SIMPLE)
+
+  public static class UserAgentFieldFunction implements DrillSimpleFunc {
+    @Param
+    VarCharHolder input;
+
+    @Param
+    VarCharHolder desiredField;
+
+    @Output
+    VarCharHolder out;
+
+    @Inject
+    DrillBuf outBuffer;
+
+    @Workspace
+    nl.basjes.parse.useragent.UserAgentAnalyzerDirect uaa;
+
+    public void setup() {
+      uaa = nl.basjes.parse.useragent.UserAgentAnalyzerDirect.newBuilder().dropTests().hideMatcherLoadStats().build();
+      uaa.getAllPossibleFieldNamesSorted();
+    }
+
+    public void eval() {
+      String userAgentString = org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.getStringFromVarCharHolder(input);
+      String requestedField = org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.getStringFromVarCharHolder(desiredField);
+
+      if (userAgentString.isEmpty() || userAgentString.equals("null")) {
+        userAgentString = "";
+      }
+
+      nl.basjes.parse.useragent.UserAgent agent = uaa.parse(userAgentString);
+      String field = agent.getValue(requestedField);

Review comment:
That is correct. I added a unit test for that.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the
[jira] [Commented] (DRILL-7343) Add User-Agent UDFs to Drill
[ https://issues.apache.org/jira/browse/DRILL-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16917144#comment-16917144 ]

ASF GitHub Bot commented on DRILL-7343:
---

cgivre commented on pull request #1840: DRILL-7343: Add User-Agent UDFs to Drill
URL: https://github.com/apache/drill/pull/1840#discussion_r318314867

## File path: contrib/udfs/src/test/java/org/apache/drill/exec/udfs/TestUserAgentFunctions.java ##
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.udfs;
+
+import org.apache.drill.categories.SqlFunctionTest;
+import org.apache.drill.categories.UnlikelyTest;
+import org.apache.drill.test.ClusterFixture;
+import org.apache.drill.test.ClusterFixtureBuilder;
+import org.apache.drill.test.ClusterTest;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+
+@Category({UnlikelyTest.class, SqlFunctionTest.class})
+public class TestUserAgentFunctions extends ClusterTest {
+
+  @BeforeClass
+  public static void setup() throws Exception {
+    ClusterFixtureBuilder builder = ClusterFixture.builder(dirTestWatcher);
+    startCluster(builder);
+  }
+
+  @Test
+  public void testParseUserAgentString() throws Exception {
+    String query = "SELECT t1.ua.DeviceClass AS DeviceClass,\n" +
+      "t1.ua.DeviceName AS DeviceName,\n" +
+      "t1.ua.DeviceBrand AS DeviceBrand,\n" +
+      "t1.ua.DeviceCpuBits AS DeviceCpuBits,\n" +
+      "t1.ua.OperatingSystemClass AS OperatingSystemClass,\n" +
+      "t1.ua.OperatingSystemName AS OperatingSystemName,\n" +
+      "t1.ua.OperatingSystemVersion AS OperatingSystemVersion,\n" +
+      "t1.ua.OperatingSystemVersionMajor AS OperatingSystemVersionMajor,\n" +
+      "t1.ua.OperatingSystemNameVersion AS OperatingSystemNameVersion,\n" +
+      "t1.ua.OperatingSystemNameVersionMajor AS OperatingSystemNameVersionMajor,\n" +
+      "t1.ua.LayoutEngineClass AS LayoutEngineClass,\n" +
+      "t1.ua.LayoutEngineName AS LayoutEngineName,\n" +
+      "t1.ua.LayoutEngineVersion AS LayoutEngineVersion,\n" +
+      "t1.ua.LayoutEngineVersionMajor AS LayoutEngineVersionMajor,\n" +
+      "t1.ua.LayoutEngineNameVersion AS LayoutEngineNameVersion,\n" +
+      "t1.ua.LayoutEngineBuild AS LayoutEngineBuild,\n" +
+      "t1.ua.AgentClass AS AgentClass,\n" +
+      "t1.ua.AgentName AS AgentName,\n" +
+      "t1.ua.AgentVersion AS AgentVersion,\n" +
+      "t1.ua.AgentVersionMajor AS AgentVersionMajor,\n" +
+      "t1.ua.AgentNameVersionMajor AS AgentNameVersionMajor,\n" +
+      "t1.ua.AgentLanguage AS AgentLanguage,\n" +
+      "t1.ua.AgentLanguageCode AS AgentLanguageCode,\n" +
+      "t1.ua.AgentSecurity AS AgentSecurity\n" +
+      "FROM (SELECT parse_user_agent('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11') AS ua FROM (values(1))) AS t1";
+
+    testBuilder().sqlQuery(query).unOrdered()

Review comment:
Fixed

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Add User-Agent UDFs to Drill
>
>
> Key: DRILL-7343
> URL: https://issues.apache.org/jira/browse/DRILL-7343
> Project: Apache Drill
> Issue Type: New Feature
> Affects Versions: 1.17.0
> Reporter: Charles Givre
> Assignee: Charles Givre
> Priority: Major
> Fix For: 1.17.0
>
>
> This collection of UDFs adds the ability to parse user agent strings which is
> useful for security data analysis.

--
This message was sent by Atlassian Jira (v8.3.2#803003)
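For readers following the UserAgentFunctions.java diff quoted earlier in this thread: in eval(), each parsed field is encoded to bytes, copied into the output buffer at offset 0, and the holder's start/end are set as byte offsets. The standalone sketch below illustrates only that copy step. It is not code from the pull request: java.nio.ByteBuffer stands in for DrillBuf, the hypothetical Slice class stands in for VarCharHolder, and the explicit UTF-8 charset is an assumption (the quoted diff calls field.getBytes() with no charset argument, which uses the platform default).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class HolderCopySketch {

  /** Hypothetical stand-in for VarCharHolder: a buffer plus byte offsets. */
  public static final class Slice {
    public final ByteBuffer buffer;
    public final int start;
    public final int end;

    Slice(ByteBuffer buffer, int start, int end) {
      this.buffer = buffer;
      this.start = start;
      this.end = end;
    }

    /** Decodes the bytes in [start, end) back into a String. */
    public String asString() {
      byte[] bytes = new byte[end - start];
      for (int i = 0; i < bytes.length; i++) {
        bytes[i] = buffer.get(start + i); // absolute read, position untouched
      }
      return new String(bytes, StandardCharsets.UTF_8);
    }
  }

  /** Encodes one field value and records where its bytes live in the buffer. */
  public static Slice copyField(String field) {
    // Assumption: UTF-8 is chosen explicitly here for a deterministic result.
    byte[] bytes = field.getBytes(StandardCharsets.UTF_8);
    ByteBuffer buffer = ByteBuffer.allocate(bytes.length);
    buffer.put(bytes); // written starting at offset 0
    // end is a byte offset, not a character count
    return new Slice(buffer, 0, bytes.length);
  }

  public static void main(String[] args) {
    Slice ascii = copyField("Firefox 2");
    System.out.println(ascii.asString() + " occupies " + (ascii.end - ascii.start) + " bytes");

    Slice accented = copyField("Fran\u00e7ais");
    System.out.println(accented.asString() + " occupies " + (accented.end - accented.start) + " bytes");
  }
}
```

Because end is a byte offset, a value containing non-ASCII characters can occupy more bytes than it has characters, which is why the length is taken from the encoded byte array rather than from the String.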
[jira] [Commented] (DRILL-7343) Add User-Agent UDFs to Drill
[ https://issues.apache.org/jira/browse/DRILL-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16917143#comment-16917143 ]

ASF GitHub Bot commented on DRILL-7343:
---

cgivre commented on pull request #1840: DRILL-7343: Add User-Agent UDFs to Drill
URL: https://github.com/apache/drill/pull/1840#discussion_r318314836

## File path: contrib/udfs/src/test/java/org/apache/drill/exec/udfs/TestUserAgentFunctions.java ##
[jira] [Commented] (DRILL-7343) Add User-Agent UDFs to Drill
[ https://issues.apache.org/jira/browse/DRILL-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908182#comment-16908182 ]

ASF GitHub Bot commented on DRILL-7343:
---
arina-ielchiieva commented on pull request #1840: DRILL-7343: Add User-Agent UDFs to Drill
URL: https://github.com/apache/drill/pull/1840#discussion_r314356974

## File path: contrib/udfs/src/test/java/org/apache/drill/exec/udfs/TestUserAgentFunctions.java ##
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.udfs;
+
+import org.apache.drill.categories.SqlFunctionTest;
+import org.apache.drill.categories.UnlikelyTest;
+import org.apache.drill.test.ClusterFixture;
+import org.apache.drill.test.ClusterFixtureBuilder;
+import org.apache.drill.test.ClusterTest;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+
+@Category({UnlikelyTest.class, SqlFunctionTest.class})
+public class TestUserAgentFunctions extends ClusterTest {
+
+  @BeforeClass
+  public static void setup() throws Exception {
+    ClusterFixtureBuilder builder = ClusterFixture.builder(dirTestWatcher);
+    startCluster(builder);
+  }
+
+  @Test
+  public void testParseUserAgentString() throws Exception {
+    String query = "SELECT t1.ua.DeviceClass AS DeviceClass,\n" +
+      "t1.ua.DeviceName AS DeviceName,\n" +
+      "t1.ua.DeviceBrand AS DeviceBrand,\n" +
+      "t1.ua.DeviceCpuBits AS DeviceCpuBits,\n" +
+      "t1.ua.OperatingSystemClass AS OperatingSystemClass,\n" +
+      "t1.ua.OperatingSystemName AS OperatingSystemName,\n" +
+      "t1.ua.OperatingSystemVersion AS OperatingSystemVersion,\n" +
+      "t1.ua.OperatingSystemVersionMajor AS OperatingSystemVersionMajor,\n" +
+      "t1.ua.OperatingSystemNameVersion AS OperatingSystemNameVersion,\n" +
+      "t1.ua.OperatingSystemNameVersionMajor AS OperatingSystemNameVersionMajor,\n" +
+      "t1.ua.LayoutEngineClass AS LayoutEngineClass,\n" +
+      "t1.ua.LayoutEngineName AS LayoutEngineName,\n" +
+      "t1.ua.LayoutEngineVersion AS LayoutEngineVersion,\n" +
+      "t1.ua.LayoutEngineVersionMajor AS LayoutEngineVersionMajor,\n" +
+      "t1.ua.LayoutEngineNameVersion AS LayoutEngineNameVersion,\n" +
+      "t1.ua.LayoutEngineBuild AS LayoutEngineBuild,\n" +
+      "t1.ua.AgentClass AS AgentClass,\n" +
+      "t1.ua.AgentName AS AgentName,\n" +
+      "t1.ua.AgentVersion AS AgentVersion,\n" +
+      "t1.ua.AgentVersionMajor AS AgentVersionMajor,\n" +
+      "t1.ua.AgentNameVersionMajor AS AgentNameVersionMajor,\n" +
+      "t1.ua.AgentLanguage AS AgentLanguage,\n" +
+      "t1.ua.AgentLanguageCode AS AgentLanguageCode,\n" +
+      "t1.ua.AgentSecurity AS AgentSecurity\n" +
+      "FROM (SELECT parse_user_agent('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11') AS ua FROM (values(1))) AS t1";
+
+    testBuilder().sqlQuery(query).unOrdered()

Review comment:
It's more common to start each chained method call on a new line:
```
testBuilder()
  .sqlQuery(query)
  .unOrdered()
  .baselineColumns(...)
  .baselineValues(...)
  .go();
```
[jira] [Commented] (DRILL-7343) Add User-Agent UDFs to Drill
[ https://issues.apache.org/jira/browse/DRILL-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908179#comment-16908179 ] ASF GitHub Bot commented on DRILL-7343: --- arina-ielchiieva commented on pull request #1840: DRILL-7343: Add User-Agent UDFs to Drill URL: https://github.com/apache/drill/pull/1840#discussion_r314358047 ## File path: contrib/udfs/src/test/java/org/apache/drill/exec/udfs/TestUserAgentFunctions.java ## @@ -0,0 +1,90 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.drill.exec.udfs; + +import org.apache.drill.categories.SqlFunctionTest; +import org.apache.drill.categories.UnlikelyTest; +import org.apache.drill.test.ClusterFixture; +import org.apache.drill.test.ClusterFixtureBuilder; +import org.apache.drill.test.ClusterTest; +import org.junit.BeforeClass; +import org.junit.Test; +import org.junit.experimental.categories.Category; + +@Category({UnlikelyTest.class, SqlFunctionTest.class}) +public class TestUserAgentFunctions extends ClusterTest { + + @BeforeClass + public static void setup() throws Exception { +ClusterFixtureBuilder builder = ClusterFixture.builder(dirTestWatcher); +startCluster(builder); + } + + @Test + public void testParseUserAgentString() throws Exception { +String query = "SELECT t1.ua.DeviceClass AS DeviceClass,\n" + + "t1.ua.DeviceName AS DeviceName,\n" + + "t1.ua.DeviceBrand AS DeviceBrand,\n" + + "t1.ua.DeviceCpuBits AS DeviceCpuBits,\n" + + "t1.ua.OperatingSystemClass AS OperatingSystemClass,\n" + + "t1.ua.OperatingSystemName AS OperatingSystemName,\n" + + "t1.ua.OperatingSystemVersion AS OperatingSystemVersion,\n" + + "t1.ua.OperatingSystemVersionMajor AS OperatingSystemVersionMajor,\n" + + "t1.ua.OperatingSystemNameVersion AS OperatingSystemNameVersion,\n" + + "t1.ua.OperatingSystemNameVersionMajor AS OperatingSystemNameVersionMajor,\n" + + "t1.ua.LayoutEngineClass AS LayoutEngineClass,\n" + + "t1.ua.LayoutEngineName AS LayoutEngineName,\n" + + "t1.ua.LayoutEngineVersion AS LayoutEngineVersion,\n" + + "t1.ua.LayoutEngineVersionMajor AS LayoutEngineVersionMajor,\n" + + "t1.ua.LayoutEngineNameVersion AS LayoutEngineNameVersion,\n" + + "t1.ua.LayoutEngineBuild AS LayoutEngineBuild,\n" + + "t1.ua.AgentClass AS AgentClass,\n" + + "t1.ua.AgentName AS AgentName,\n" + + "t1.ua.AgentVersion AS AgentVersion,\n" + + "t1.ua.AgentVersionMajor AS AgentVersionMajor,\n" + + "t1.ua.AgentNameVersionMajor AS AgentNameVersionMajor,\n" + + "t1.ua.AgentLanguage AS AgentLanguage,\n" + + 
"t1.ua.AgentLanguageCode AS AgentLanguageCode,\n" + + "t1.ua.AgentSecurity AS AgentSecurity\n" + + "FROM (SELECT parse_user_agent('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11') AS ua FROM (values(1))) AS t1"; + +testBuilder().sqlQuery(query).unOrdered() + .baselineColumns("DeviceClass", "DeviceName", "DeviceBrand","DeviceCpuBits","OperatingSystemClass", "OperatingSystemName","OperatingSystemVersion", + "OperatingSystemVersionMajor","OperatingSystemNameVersion","OperatingSystemNameVersionMajor","LayoutEngineClass","LayoutEngineName","LayoutEngineVersion", + "LayoutEngineVersionMajor","LayoutEngineNameVersion","LayoutEngineBuild","AgentClass","AgentName","AgentVersion","AgentVersionMajor","AgentNameVersionMajor", +"AgentLanguage","AgentLanguageCode","AgentSecurity") + .baselineValues("Desktop","Desktop", "Unknown","32", "Desktop", "Windows NT", "XP", "XP", "Windows XP", "Windows XP", "Browser", "Gecko", "1.8.1.11", "1", "Gecko 1.8.1.11", +"20071127", "Browser", "Firefox", "2.0.0.11", "2", "Firefox 2", "English (United States)", "en-us","Strong security") + .go(); + } + + @Test + public void testGetHostName() throws Exception { +String query = "SELECT parse_user_agent('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11', 'AgentSecurity') AS agent FROM " + + "(values" + + "(1))"; +
[jira] [Commented] (DRILL-7343) Add User-Agent UDFs to Drill
[ https://issues.apache.org/jira/browse/DRILL-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908180#comment-16908180 ] ASF GitHub Bot commented on DRILL-7343: --- arina-ielchiieva commented on pull request #1840: DRILL-7343: Add User-Agent UDFs to Drill URL: https://github.com/apache/drill/pull/1840#discussion_r314356394 ## File path: contrib/udfs/src/test/java/org/apache/drill/exec/udfs/TestUserAgentFunctions.java ## @@ -0,0 +1,90 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.drill.exec.udfs; + +import org.apache.drill.categories.SqlFunctionTest; +import org.apache.drill.categories.UnlikelyTest; +import org.apache.drill.test.ClusterFixture; +import org.apache.drill.test.ClusterFixtureBuilder; +import org.apache.drill.test.ClusterTest; +import org.junit.BeforeClass; +import org.junit.Test; +import org.junit.experimental.categories.Category; + +@Category({UnlikelyTest.class, SqlFunctionTest.class}) +public class TestUserAgentFunctions extends ClusterTest { + + @BeforeClass + public static void setup() throws Exception { +ClusterFixtureBuilder builder = ClusterFixture.builder(dirTestWatcher); +startCluster(builder); + } + + @Test + public void testParseUserAgentString() throws Exception { +String query = "SELECT t1.ua.DeviceClass AS DeviceClass,\n" + + "t1.ua.DeviceName AS DeviceName,\n" + + "t1.ua.DeviceBrand AS DeviceBrand,\n" + + "t1.ua.DeviceCpuBits AS DeviceCpuBits,\n" + + "t1.ua.OperatingSystemClass AS OperatingSystemClass,\n" + + "t1.ua.OperatingSystemName AS OperatingSystemName,\n" + + "t1.ua.OperatingSystemVersion AS OperatingSystemVersion,\n" + + "t1.ua.OperatingSystemVersionMajor AS OperatingSystemVersionMajor,\n" + + "t1.ua.OperatingSystemNameVersion AS OperatingSystemNameVersion,\n" + + "t1.ua.OperatingSystemNameVersionMajor AS OperatingSystemNameVersionMajor,\n" + + "t1.ua.LayoutEngineClass AS LayoutEngineClass,\n" + + "t1.ua.LayoutEngineName AS LayoutEngineName,\n" + + "t1.ua.LayoutEngineVersion AS LayoutEngineVersion,\n" + + "t1.ua.LayoutEngineVersionMajor AS LayoutEngineVersionMajor,\n" + + "t1.ua.LayoutEngineNameVersion AS LayoutEngineNameVersion,\n" + + "t1.ua.LayoutEngineBuild AS LayoutEngineBuild,\n" + + "t1.ua.AgentClass AS AgentClass,\n" + + "t1.ua.AgentName AS AgentName,\n" + + "t1.ua.AgentVersion AS AgentVersion,\n" + + "t1.ua.AgentVersionMajor AS AgentVersionMajor,\n" + + "t1.ua.AgentNameVersionMajor AS AgentNameVersionMajor,\n" + + "t1.ua.AgentLanguage AS AgentLanguage,\n" + + 
"t1.ua.AgentLanguageCode AS AgentLanguageCode,\n" + + "t1.ua.AgentSecurity AS AgentSecurity\n" + + "FROM (SELECT parse_user_agent('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11') AS ua FROM (values(1))) AS t1"; + +testBuilder().sqlQuery(query).unOrdered() + .baselineColumns("DeviceClass", "DeviceName", "DeviceBrand","DeviceCpuBits","OperatingSystemClass", "OperatingSystemName","OperatingSystemVersion", + "OperatingSystemVersionMajor","OperatingSystemNameVersion","OperatingSystemNameVersionMajor","LayoutEngineClass","LayoutEngineName","LayoutEngineVersion", + "LayoutEngineVersionMajor","LayoutEngineNameVersion","LayoutEngineBuild","AgentClass","AgentName","AgentVersion","AgentVersionMajor","AgentNameVersionMajor", +"AgentLanguage","AgentLanguageCode","AgentSecurity") + .baselineValues("Desktop","Desktop", "Unknown","32", "Desktop", "Windows NT", "XP", "XP", "Windows XP", "Windows XP", "Browser", "Gecko", "1.8.1.11", "1", "Gecko 1.8.1.11", +"20071127", "Browser", "Firefox", "2.0.0.11", "2", "Firefox 2", "English (United States)", "en-us","Strong security") + .go(); + } + + @Test + public void testGetHostName() throws Exception { +String query = "SELECT parse_user_agent('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11', 'AgentSecurity') AS agent FROM " + + "(values" + + "(1))"; +
[jira] [Commented] (DRILL-7343) Add User-Agent UDFs to Drill
[ https://issues.apache.org/jira/browse/DRILL-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908173#comment-16908173 ]

ASF GitHub Bot commented on DRILL-7343:
---
arina-ielchiieva commented on pull request #1840: DRILL-7343: Add User-Agent UDFs to Drill
URL: https://github.com/apache/drill/pull/1840#discussion_r314354091

## File path: contrib/udfs/README.md ##
@@ -0,0 +1,56 @@
+# Drill User Defined Functions
+
+This `README` documents functions which users have submitted to Apaceh Drill.
+
+## User Agent Functions
+Drill UDF for parsing User Agent Strings.
+This function is based on Niels Basjes Java library for parsing user agent strings which is available here: https://github.com/nielsbasjes/yauaa.
+
+### Usage
+Using this function is fairly simple. The function `parse_user_agent()` takes a user agent string as an argument and returns a map of the available fields. Note that not every field will be present in every user agent string.
+```
+SELECT parse_user_agent( columns[0] ) as ua
+FROM dfs.`/Users/cgivre/drill-httpd/ua.csv`;

Review comment:
Please use a generic source rather than your own: `/tmp/data`
[jira] [Commented] (DRILL-7343) Add User-Agent UDFs to Drill
[ https://issues.apache.org/jira/browse/DRILL-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908178#comment-16908178 ]

ASF GitHub Bot commented on DRILL-7343:
---
arina-ielchiieva commented on pull request #1840: DRILL-7343: Add User-Agent UDFs to Drill
URL: https://github.com/apache/drill/pull/1840#discussion_r314354355

## File path: contrib/udfs/README.md ##
@@ -0,0 +1,56 @@
+# Drill User Defined Functions
+
+This `README` documents functions which users have submitted to Apaceh Drill.
+
+## User Agent Functions
+Drill UDF for parsing User Agent Strings.
+This function is based on Niels Basjes Java library for parsing user agent strings which is available here: https://github.com/nielsbasjes/yauaa.
+
+### Usage
+Using this function is fairly simple. The function `parse_user_agent()` takes a user agent string as an argument and returns a map of the available fields. Note that not every field will be present in every user agent string.
+```
+SELECT parse_user_agent( columns[0] ) as ua
+FROM dfs.`/Users/cgivre/drill-httpd/ua.csv`;
+```
+The query above returns:
+```
+{
+  "DeviceClass":"Desktop",
+  "DeviceName":"Macintosh",
+  "DeviceBrand":"Apple",
+  "OperatingSystemClass":"Desktop",
+  "OperatingSystemName":"Mac OS X",
+  "OperatingSystemVersion":"10.10.1",
+  "OperatingSystemNameVersion":"Mac OS X 10.10.1",
+  "LayoutEngineClass":"Browser",
+  "LayoutEngineName":"Blink",
+  "LayoutEngineVersion":"39.0",
+  "LayoutEngineVersionMajor":"39",
+  "LayoutEngineNameVersion":"Blink 39.0",
+  "LayoutEngineNameVersionMajor":"Blink 39",
+  "AgentClass":"Browser",
+  "AgentName":"Chrome",
+  "AgentVersion":"39.0.2171.99",
+  "AgentVersionMajor":"39",
+  "AgentNameVersion":"Chrome 39.0.2171.99",
+  "AgentNameVersionMajor":"Chrome 39",
+  "DeviceCpu":"Intel"
+}
+```
+The function returns a Drill map, so you can access any of the fields using Drill's table.map.key notation. For example, the query below illustrates how to extract a field from this map and summarize it:
+
+```
+SELECT uadata.ua.AgentNameVersion AS Browser,
+COUNT( * ) AS BrowserCount
+FROM (
+  SELECT parse_user_agent( columns[0] ) AS ua
+  FROM dfs.drillworkshop.`user-agents.csv`
+) AS uadata
+GROUP BY uadata.ua.AgentNameVersion
+ORDER BY BrowserCount DESC
+```
+The function can also be called with an optional field as an argument. IE:

Review comment:
```suggestion
The function can also be called with an optional field as an argument. IE:
```
[jira] [Commented] (DRILL-7343) Add User-Agent UDFs to Drill
[ https://issues.apache.org/jira/browse/DRILL-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908175#comment-16908175 ]

ASF GitHub Bot commented on DRILL-7343:
---
arina-ielchiieva commented on pull request #1840: DRILL-7343: Add User-Agent UDFs to Drill
URL: https://github.com/apache/drill/pull/1840#discussion_r314356145

## File path: contrib/udfs/src/main/java/org/apache/drill/exec/udfs/UserAgentFunctions.java ##
@@ -0,0 +1,126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.udfs;
+
+import io.netty.buffer.DrillBuf;
+import org.apache.drill.exec.expr.DrillSimpleFunc;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate;
+import org.apache.drill.exec.expr.annotations.Output;
+import org.apache.drill.exec.expr.annotations.Param;
+import org.apache.drill.exec.expr.annotations.Workspace;
+import org.apache.drill.exec.expr.holders.VarCharHolder;
+import org.apache.drill.exec.vector.complex.writer.BaseWriter;
+
+import javax.inject.Inject;
+
+public class UserAgentFunctions {
+
+  @FunctionTemplate(name = "parse_user_agent", scope = FunctionTemplate.FunctionScope.SIMPLE)
+
+  public static class UserAgentFunction implements DrillSimpleFunc {
+    @Param
+    VarCharHolder input;
+
+    @Output
+    BaseWriter.ComplexWriter outWriter;
+
+    @Inject
+    DrillBuf outBuffer;
+
+    @Workspace
+    nl.basjes.parse.useragent.UserAgentAnalyzerDirect uaa;
+
+    public void setup() {
+      uaa = nl.basjes.parse.useragent.UserAgentAnalyzerDirect.newBuilder().dropTests().hideMatcherLoadStats().build();
+      uaa.getAllPossibleFieldNamesSorted();
+    }
+
+    public void eval() {
+      org.apache.drill.exec.vector.complex.writer.BaseWriter.MapWriter queryMapWriter = outWriter.rootAsMap();
+
+      String userAgentString = org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.getStringFromVarCharHolder(input);
+
+      if (userAgentString.isEmpty() || userAgentString.equals("null")) {
+        userAgentString = "";
+      }
+
+      nl.basjes.parse.useragent.UserAgent agent = uaa.parse(userAgentString);
+
+      for (String fieldName : agent.getAvailableFieldNamesSorted()) {
+
+        org.apache.drill.exec.expr.holders.VarCharHolder rowHolder = new org.apache.drill.exec.expr.holders.VarCharHolder();
+        String field = agent.getValue(fieldName);
+
+        byte[] rowStringBytes = field.getBytes();
+        outBuffer.reallocIfNeeded(rowStringBytes.length);
+        outBuffer.setBytes(0, rowStringBytes);
+
+        rowHolder.start = 0;
+        rowHolder.end = rowStringBytes.length;
+        rowHolder.buffer = outBuffer;
+
+        queryMapWriter.varChar(fieldName).write(rowHolder);
+      }
+    }
+  }
+
+  @FunctionTemplate(name = "parse_user_agent", scope = FunctionTemplate.FunctionScope.SIMPLE)
+
+  public static class UserAgentFieldFunction implements DrillSimpleFunc {
+    @Param
+    VarCharHolder input;
+
+    @Param
+    VarCharHolder desiredField;
+
+    @Output
+    VarCharHolder out;
+
+    @Inject
+    DrillBuf outBuffer;
+
+    @Workspace
+    nl.basjes.parse.useragent.UserAgentAnalyzerDirect uaa;
+
+    public void setup() {
+      uaa = nl.basjes.parse.useragent.UserAgentAnalyzerDirect.newBuilder().dropTests().hideMatcherLoadStats().build();
+      uaa.getAllPossibleFieldNamesSorted();
+    }
+
+    public void eval() {
+      String userAgentString = org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.getStringFromVarCharHolder(input);
+      String requestedField = org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.getStringFromVarCharHolder(desiredField);
+
+      if (userAgentString.isEmpty() || userAgentString.equals("null")) {
+        userAgentString = "";
+      }
+
+      nl.basjes.parse.useragent.UserAgent agent = uaa.parse(userAgentString);
+      String field = agent.getValue(requestedField);

Review comment:
What is the behavior if the requested field is absent? Does it return `Unknown`?
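On the absent-field question above, here is a minimal sketch of one way the lookup could be guarded before the value is copied into the output buffer. This is not code from the PR: `FieldLookupSketch`, `fieldBytes`, the `FALLBACK` constant, and the plain `Map` standing in for the parsed user agent are all hypothetical. What yauaa's `getValue` actually returns for a field it does not recognise is not established in this thread and would need to be checked against the library.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class FieldLookupSketch {

  // Hypothetical fallback; "Unknown" is only the value floated in the review question.
  static final String FALLBACK = "Unknown";

  // Returns the UTF-8 bytes of the requested field, or of the fallback
  // when the field is absent or mapped to null.
  static byte[] fieldBytes(Map<String, String> parsedAgent, String requestedField) {
    String field = parsedAgent.get(requestedField);
    if (field == null) {
      field = FALLBACK;
    }
    return field.getBytes(StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    // A plain map stands in for the parsed user agent (assumption for illustration).
    Map<String, String> parsedAgent = new HashMap<>();
    parsedAgent.put("AgentName", "Firefox");

    // prints "Firefox"
    System.out.println(new String(fieldBytes(parsedAgent, "AgentName"), StandardCharsets.UTF_8));
    // prints "Unknown"
    System.out.println(new String(fieldBytes(parsedAgent, "NoSuchField"), StandardCharsets.UTF_8));
  }
}
```

The explicit charset is a separate choice made in this sketch for reproducibility; the diff itself calls `getBytes()` with no argument.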
[jira] [Commented] (DRILL-7343) Add User-Agent UDFs to Drill
[ https://issues.apache.org/jira/browse/DRILL-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908171#comment-16908171 ]

ASF GitHub Bot commented on DRILL-7343:
---
arina-ielchiieva commented on pull request #1840: DRILL-7343: Add User-Agent UDFs to Drill
URL: https://github.com/apache/drill/pull/1840#discussion_r314353528

## File path: contrib/udfs/README.md ##
@@ -0,0 +1,56 @@
+# Drill User Defined Functions
+
+This `README` documents functions which users have submitted to Apaceh Drill.
+
+## User Agent Functions
+Drill UDF for parsing User Agent Strings.
+This function is based on Niels Basjes Java library for parsing user agent strings which is available here: https://github.com/nielsbasjes/yauaa.
+
+### Usage
+Using this function is fairly simple. The function `parse_user_agent()` takes a user agent string as an argument and returns a map of the available fields. Note that not every field will be present in every user agent string.

Review comment:
```suggestion
The function `parse_user_agent()` takes a user agent string as an argument and returns a map of the available fields. Note that not every field will be present in every user agent string.
```
[jira] [Commented] (DRILL-7343) Add User-Agent UDFs to Drill
[ https://issues.apache.org/jira/browse/DRILL-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908177#comment-16908177 ]

ASF GitHub Bot commented on DRILL-7343:
---
arina-ielchiieva commented on pull request #1840: DRILL-7343: Add User-Agent UDFs to Drill
URL: https://github.com/apache/drill/pull/1840#discussion_r314356974

## File path: contrib/udfs/src/test/java/org/apache/drill/exec/udfs/TestUserAgentFunctions.java ##
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.udfs;
+
+import org.apache.drill.categories.SqlFunctionTest;
+import org.apache.drill.categories.UnlikelyTest;
+import org.apache.drill.test.ClusterFixture;
+import org.apache.drill.test.ClusterFixtureBuilder;
+import org.apache.drill.test.ClusterTest;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+
+@Category({UnlikelyTest.class, SqlFunctionTest.class})
+public class TestUserAgentFunctions extends ClusterTest {
+
+  @BeforeClass
+  public static void setup() throws Exception {
+    ClusterFixtureBuilder builder = ClusterFixture.builder(dirTestWatcher);
+    startCluster(builder);
+  }
+
+  @Test
+  public void testParseUserAgentString() throws Exception {
+    String query = "SELECT t1.ua.DeviceClass AS DeviceClass,\n" +
+      "t1.ua.DeviceName AS DeviceName,\n" +
+      "t1.ua.DeviceBrand AS DeviceBrand,\n" +
+      "t1.ua.DeviceCpuBits AS DeviceCpuBits,\n" +
+      "t1.ua.OperatingSystemClass AS OperatingSystemClass,\n" +
+      "t1.ua.OperatingSystemName AS OperatingSystemName,\n" +
+      "t1.ua.OperatingSystemVersion AS OperatingSystemVersion,\n" +
+      "t1.ua.OperatingSystemVersionMajor AS OperatingSystemVersionMajor,\n" +
+      "t1.ua.OperatingSystemNameVersion AS OperatingSystemNameVersion,\n" +
+      "t1.ua.OperatingSystemNameVersionMajor AS OperatingSystemNameVersionMajor,\n" +
+      "t1.ua.LayoutEngineClass AS LayoutEngineClass,\n" +
+      "t1.ua.LayoutEngineName AS LayoutEngineName,\n" +
+      "t1.ua.LayoutEngineVersion AS LayoutEngineVersion,\n" +
+      "t1.ua.LayoutEngineVersionMajor AS LayoutEngineVersionMajor,\n" +
+      "t1.ua.LayoutEngineNameVersion AS LayoutEngineNameVersion,\n" +
+      "t1.ua.LayoutEngineBuild AS LayoutEngineBuild,\n" +
+      "t1.ua.AgentClass AS AgentClass,\n" +
+      "t1.ua.AgentName AS AgentName,\n" +
+      "t1.ua.AgentVersion AS AgentVersion,\n" +
+      "t1.ua.AgentVersionMajor AS AgentVersionMajor,\n" +
+      "t1.ua.AgentNameVersionMajor AS AgentNameVersionMajor,\n" +
+      "t1.ua.AgentLanguage AS AgentLanguage,\n" +
+      "t1.ua.AgentLanguageCode AS AgentLanguageCode,\n" +
+      "t1.ua.AgentSecurity AS AgentSecurity\n" +
+      "FROM (SELECT parse_user_agent('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11') AS ua FROM (values(1))) AS t1";
+
+    testBuilder().sqlQuery(query).unOrdered()

Review comment:
It's more common to start each chained method call on a new line:
```
testBuilder()
  .sqlQuery(query)
  .unOrdered()
  .baselineColumns(...)
  .baselineValues(...)
  .go();
```
[jira] [Commented] (DRILL-7343) Add User-Agent UDFs to Drill
[ https://issues.apache.org/jira/browse/DRILL-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908170#comment-16908170 ] ASF GitHub Bot commented on DRILL-7343: --- arina-ielchiieva commented on pull request #1840: DRILL-7343: Add User-Agent UDFs to Drill URL: https://github.com/apache/drill/pull/1840#discussion_r314352894 ## File path: contrib/udfs/README.md ## @@ -0,0 +1,56 @@ +# Drill User Defined Functions + +This `README` documents functions which users have submitted to Apaceh Drill. Review comment:
```suggestion
This `README` documents functions which users have submitted to Apache Drill.
```
[jira] [Commented] (DRILL-7343) Add User-Agent UDFs to Drill
[ https://issues.apache.org/jira/browse/DRILL-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908172#comment-16908172 ] ASF GitHub Bot commented on DRILL-7343: --- arina-ielchiieva commented on pull request #1840: DRILL-7343: Add User-Agent UDFs to Drill URL: https://github.com/apache/drill/pull/1840#discussion_r314354293 ## File path: contrib/udfs/README.md ## @@ -0,0 +1,56 @@ +# Drill User Defined Functions + +This `README` documents functions which users have submitted to Apaceh Drill. + +## User Agent Functions +Drill UDF for parsing User Agent Strings. +This function is based on Niels Basjes Java library for parsing user agent strings which is available here: https://github.com/nielsbasjes/yauaa. + +### Usage +Using this function is fairly simple. The function `parse_user_agent()` takes a user agent string as an argument and returns a map of the available fields. Note that not every field will be present in every user agent string. +``` +SELECT parse_user_agent( columns[0] ) as ua +FROM dfs.`/Users/cgivre/drill-httpd/ua.csv`; +``` +The query above returns: +``` +{ + "DeviceClass":"Desktop", + "DeviceName":"Macintosh", + "DeviceBrand":"Apple", + "OperatingSystemClass":"Desktop", + "OperatingSystemName":"Mac OS X", + "OperatingSystemVersion":"10.10.1", + "OperatingSystemNameVersion":"Mac OS X 10.10.1", + "LayoutEngineClass":"Browser", + "LayoutEngineName":"Blink", + "LayoutEngineVersion":"39.0", + "LayoutEngineVersionMajor":"39", + "LayoutEngineNameVersion":"Blink 39.0", + "LayoutEngineNameVersionMajor":"Blink 39", + "AgentClass":"Browser", + "AgentName":"Chrome", + "AgentVersion":"39.0.2171.99", + "AgentVersionMajor":"39", + "AgentNameVersion":"Chrome 39.0.2171.99", + "AgentNameVersionMajor":"Chrome 39", + "DeviceCpu":"Intel" +} +``` +The function returns a Drill map, so you can access any of the fields using Drill's table.map.key notation. 
For example, the query below illustrates how to extract a field from this map and summarize it: Review comment:
```suggestion
The function returns a Drill map, so you can access any of the fields using Drill's table.map.key notation. For example, the query below illustrates how to extract a field from this map and summarize it:
```
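The README excerpt quoted in the comment above breaks off just before the query it introduces. A minimal sketch of what such a query could look like, assuming the same `ua.csv` file as the README's earlier example and the `t1.ua.<field>` access pattern used in `TestUserAgentFunctions`; the aliases `browser` and `browser_count` are illustrative and do not come from the PR:

```sql
-- Hypothetical example: count rows per browser name.
-- parse_user_agent() returns a map, so fields are reached via table.map.key.
SELECT t1.ua.AgentName AS browser,
       COUNT(*) AS browser_count
FROM (
  -- columns[0] holds the raw user agent string in a headerless CSV
  SELECT parse_user_agent(columns[0]) AS ua
  FROM dfs.`/Users/cgivre/drill-httpd/ua.csv`
) AS t1
GROUP BY t1.ua.AgentName
ORDER BY browser_count DESC;
```

The subquery alias (`t1`) matters here: Drill resolves map fields through a table qualifier. Since the README notes that not every field is present in every user agent string, some groups may come back with a null key.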
[jira] [Commented] (DRILL-7343) Add User-Agent UDFs to Drill
[ https://issues.apache.org/jira/browse/DRILL-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908176#comment-16908176 ] ASF GitHub Bot commented on DRILL-7343: --- arina-ielchiieva commented on pull request #1840: DRILL-7343: Add User-Agent UDFs to Drill URL: https://github.com/apache/drill/pull/1840#discussion_r314355115 ## File path: contrib/udfs/src/main/java/org/apache/drill/exec/udfs/UserAgentFunctions.java ## @@ -0,0 +1,126 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.drill.exec.udfs; + +import io.netty.buffer.DrillBuf; +import org.apache.drill.exec.expr.DrillSimpleFunc; +import org.apache.drill.exec.expr.annotations.FunctionTemplate; +import org.apache.drill.exec.expr.annotations.Output; +import org.apache.drill.exec.expr.annotations.Param; +import org.apache.drill.exec.expr.annotations.Workspace; +import org.apache.drill.exec.expr.holders.VarCharHolder; +import org.apache.drill.exec.vector.complex.writer.BaseWriter; + +import javax.inject.Inject; + +public class UserAgentFunctions { + + @FunctionTemplate(name = "parse_user_agent", scope = FunctionTemplate.FunctionScope.SIMPLE) + Review comment: Remove new line. 
[jira] [Commented] (DRILL-7343) Add User-Agent UDFs to Drill
[ https://issues.apache.org/jira/browse/DRILL-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908174#comment-16908174 ] ASF GitHub Bot commented on DRILL-7343: --- arina-ielchiieva commented on pull request #1840: DRILL-7343: Add User-Agent UDFs to Drill URL: https://github.com/apache/drill/pull/1840#discussion_r314353595 ## File path: contrib/udfs/README.md ## @@ -0,0 +1,56 @@ +# Drill User Defined Functions + +This `README` documents functions which users have submitted to Apaceh Drill. + +## User Agent Functions +Drill UDF for parsing User Agent Strings. +This function is based on Niels Basjes Java library for parsing user agent strings which is available here: https://github.com/nielsbasjes/yauaa. Review comment:
```suggestion
This function is based on Niels Basjes Java library for parsing user agent strings which is available here: https://github.com/nielsbasjes/yauaa.
```
[jira] [Commented] (DRILL-7343) Add User-Agent UDFs to Drill
[ https://issues.apache.org/jira/browse/DRILL-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16904787#comment-16904787 ] ASF GitHub Bot commented on DRILL-7343: --- cgivre commented on pull request #1840: DRILL-7343: Add User-Agent UDFs to Drill URL: https://github.com/apache/drill/pull/1840 These UDFs add the ability to parse user agent strings, which is useful for security data analysis. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add User-Agent UDFs to Drill > > > Key: DRILL-7343 > URL: https://issues.apache.org/jira/browse/DRILL-7343 > Project: Apache Drill > Issue Type: New Feature >Affects Versions: 1.17.0 >Reporter: Charles Givre >Assignee: Charles Givre >Priority: Major > Fix For: 1.17.0 > > > This collection of UDFs adds the ability to parse user agent strings which is > useful for security data analysis. -- This message was sent by Atlassian JIRA (v7.6.14#76016)