Re: Flume to Phoenix as Sink Issue

2014-12-21 Thread Divya Nagarajan
Hi Ravi Kiran,

   Thank you very much for your reply; it worked well. I can now see the
Apache data in Phoenix.
Do you have any idea when Apache Phoenix will support the UNION
statement?
We are eagerly waiting for it. Apache Phoenix is a really good tool and
very useful.

Thanks a lot!

Divya N

On Mon, Dec 22, 2014 at 11:33 AM, Ravi Kiran 
wrote:

> Hi Divya,
>
>   Based on the logs you shared, could you please change the following
> entries:
>
> agent.sinks.phoenix-sink.serializer.regex=^([\\d.]+) (\\S+) (\\S+) \\[([\\w:/]+\\s[+\\-]\\d{4})\\] \"(.+?)\" (\\d{3}) (\\d+) \"([^\"]+)\" \"([^\"]+)\"
>
> agent.sinks.phoenix-sink.serializer.columns=host,identity,user,time,request,status,size,referer,agent
>
> As for changing the logging level, try changing the entry in
> log4j.properties.
>
> Regards
> Ravi

Re: Flume to Phoenix as Sink Issue

2014-12-21 Thread Ravi Kiran
Hi Divya,

  Based on the logs you shared, could you please change the following
entries:

agent.sinks.phoenix-sink.serializer.regex=^([\\d.]+) (\\S+) (\\S+) \\[([\\w:/]+\\s[+\\-]\\d{4})\\] \"(.+?)\" (\\d{3}) (\\d+) \"([^\"]+)\" \"([^\"]+)\"
agent.sinks.phoenix-sink.serializer.columns=host,identity,user,time,request,status,size,referer,agent
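
One way to sanity-check the two entries above before restarting the agent is to run the regex against a sample access-log line outside Flume. The following is a minimal Python sketch, not part of Phoenix or Flume. It assumes that the properties file turns \\ into \ and \" into ", that the serializer needs the whole line to match, and that the number of capture groups has to equal the number of columns. The names PATTERN, COLUMNS, SAMPLE and parse are made up for this illustration.

```python
import re

# The serializer.regex value after Java properties unescaping
# (assumption: \\ becomes \ and \" becomes ").
PATTERN = re.compile(
    r'^([\d.]+) (\S+) (\S+) \[([\w:/]+\s[+\-]\d{4})\] '
    r'"(.+?)" (\d{3}) (\d+) "([^"]+)" "([^"]+)"'
)

# The serializer.columns value, in the same order as the capture groups.
COLUMNS = "host,identity,user,time,request,status,size,referer,agent".split(",")

# One line taken from the access log shared earlier in this thread.
SAMPLE = ('127.0.0.1 - - [20/Dec/2014:17:11:06 +0530] "GET / HTTP/1.0" '
          '403 4954 "-" "check_http/v2.0.3 (nagios-plugins 2.0.3)"')


def parse(line):
    """Return a column -> value dict, or None if the line does not match."""
    m = PATTERN.fullmatch(line.strip())
    if m is None:
        return None
    return dict(zip(COLUMNS, m.groups()))


if __name__ == "__main__":
    # A group/column count mismatch is worth catching before starting Flume.
    assert PATTERN.groups == len(COLUMNS)
    print(parse(SAMPLE))
```

If parse returns None for a real log line, that line would presumably be skipped by the sink as well, so it is worth trying a few different lines from the access log.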

As for changing the logging level, try changing the entry in
log4j.properties.

Regards
Ravi

On Sat, Dec 20, 2014 at 4:34 AM, Divya Nagarajan 
wrote:

> Hi,
>
> This is my Flume configuration file:
>
> agent.sources = tail
> agent.channels = memoryChannel
> agent.sinks = loggerSink
> agent.sinks = phoenix-sink
>
> agent.sources.tail.type = exec
> agent.sources.tail.command = tail -f /var/log/httpd/access_log
> agent.sources.tail.channels = memoryChannel
>
> agent.sinks.loggerSink.channel = memoryChannel
> agent.sinks.loggerSink.type = logger
>
> agent.channels.memoryChannel.type = memory
> agent.channels.memoryChannel.capacity = 100
>
> agent.sinks.phoenix-sink.type=org.apache.phoenix.flume.sink.PhoenixSink
> agent.sinks.phoenix-sink.channel=memoryChannel
> agent.sinks.phoenix-sink.batchSize=5
> agent.sinks.phoenix-sink.table=S1.APACHE
>
> agent.sinks.phoenix-sink.zookeeperQuorum=nn01
> agent.sinks.phoenix-sink.serializer=REGEX
> agent.sinks.phoenix-sink.serializer.rowkeyType=uuid
> agent.sinks.phoenix-sink.ddl=CREATE TABLE IF NOT EXISTS S1.APACHE (uid varchar NOT NULL,host varchar,identity varchar,user varchar,time varchar,method varchar,request varchar,protocol varchar,status INTEGER,size INTEGER,referer varchar,agent varchar,f_host varchar CONSTRAINT pk PRIMARY KEY (uid))
>
> #agent.sinks.phoenix-sink.serializer.regex="([^ ]*) ([^ ]*) ([^ ]*)
> (-|\\[[^\\]]*\\]) \"([^ ]+) ([^ ]+) ([^\"]+)\" (-|[0-9]*) (-|[0-9]*)(?: ([^
> \"]*|\"[^\"]*\") ([^ \"]$
> #agent.sinks.phoenix-sink.serializer.regex="([^ ]*) ([^ ]*) ([^ ]*)
> (-|\\[[^\\]]*\\]) \"([^ ]+) ([^ ]+) ([^\"]+)\" (-|[0-9]*) (-|[0-9]*)(?: ([^
> \"]*|\"[^\"]*\") ([^ \"]*|\"[^\"]*\"))?"
>
>
> agent.sinks.phoenix-sink.serializer.regex=([^ ]*) ([^ ]*) ([^ ]*)  ([^ ]*
> [^ ]*) "([^\"]+)\" (-|[0-9]*) (-|[0-9]*) "([^ ]*)" "([^\"]+)\"
>
> agent.sinks.phoenix-sink.serializer.columns=host,identity,user,time,method,request,protocol,status,size,referer,agent
> agent.sinks.phoenix-sink.serializer.headers=f_host
>
>
> This is my Apache log file structure:
>
> 127.0.0.1 - - [20/Dec/2014:17:11:06 +0530] "GET / HTTP/1.0" 403 4954 "-" "check_http/v2.0.3 (nagios-plugins 2.0.3)"
> 127.0.0.1 - - [20/Dec/2014:17:16:06 +0530] "GET / HTTP/1.0" 403 4954 "-" "check_http/v2.0.3 (nagios-plugins 2.0.3)"
> 127.0.0.1 - - [20/Dec/2014:17:21:06 +0530] "GET / HTTP/1.0" 403 4954 "-" "check_http/v2.0.3 (nagios-plugins 2.0.3)"
> 127.0.0.1 - - [20/Dec/2014:17:26:06 +0530] "GET / HTTP/1.0" 403 4954 "-" "check_http/v2.0.3 (nagios-plugins 2.0.3)"
> 127.0.0.1 - - [20/Dec/2014:17:31:06 +0530] "GET / HTTP/1.0" 403 4954 "-" "check_http/v2.0.3 (nagios-plugins 2.0.3)"
> 127.0.0.1 - - [20/Dec/2014:17:36:06 +0530] "GET / HTTP/1.0" 403 4954 "-" "check_http/v2.0.3 (nagios-plugins 2.0.3)"
> 127.0.0.1 - - [20/Dec/2014:17:41:06 +0530] "GET / HTTP/1.0" 403 4954 "-" "check_http/v2.0.3 (nagios-plugins 2.0.3)"
> 127.0.0.1 - - [20/Dec/2014:17:46:06 +0530] "GET / HTTP/1.0" 403 4954 "-" "check_http/v2.0.3 (nagios-plugins 2.0.3)"
> 127.0.0.1 - - [20/Dec/2014:17:51:06 +0530] "GET / HTTP/1.0" 403 4954 "-" "check_http/v2.0.3 (nagios-plugins 2.0.3)"
> 127.0.0.1 - - [20/Dec/2014:17:56:06 +0530] "GET / HTTP/1.0" 403 4954 "-" "check_http/v2.0.3 (nagios-plugins 2.0.3)"
>
>
> I am using
> Phoenix 4.2.1
> HBase 0.98.8
>
> Sorry, I enabled DEBUG mode in Flume, but it still shows only INFO as
> usual when I run this:
> flume-ng agent -c conf -f /opt/flume/conf/apache.conf -n agent -Dflume.root.looger=DEBUG,console
>
> Thanks
> Divya N
>
>
>
> On Sat, Dec 20, 2014 at 2:14 AM, Ravi Kiran 
> wrote:
>
>> Hi Divya,
>>
>> Also, can you confirm that the regex given in the configuration matches
>> the access log? To check, could you set the logging level to debug?
>> There is a debug log entry when an event doesn't match the regex
>> given in the configuration.
>>   We have a test case for processing Apache logs,
>> https://github.com/apache/phoenix/blob/master/phoenix-flume/src/it/java/org/apache/phoenix/flume/RegexEventSerializerIT.java#testApacheLogRegex
>> which can help you with the regex.
>>   Happy to help!
>>
>> Regards
>> Ravi
>>
>> On Fri, Dec 19, 2014 at 11:19 AM, Ravi Kiran 
>> wrote:
>>>
>>> Hi Nagarajan,
>>>
>>> Do you see any exceptions in the logs? Could you please try ingesting
>>> more than 100 records and see if that works? Also, could you please
>>> share the version of Phoenix you are using?
>>>
>>> Regards
>>> Ravi
>>>
>>> On Thu, Dec 18, 2014 at 10:36 PM, Divya Nagarajan <
>>> divya.se2...@gmail.com> wrote:


Hi,
I tried with 5 as the batchSize, but data is still not upserted into Phoenix.

Re: Re: What is the purpose of these system tables(CATALOG, STATS, and SEQUENCE)?

2014-12-21 Thread James Taylor
Like I said before, no, it's not OK to drop system tables. If for some
reason you don't want the sequence table pre-split 256 ways, you can
set the phoenix.sequence.saltBuckets property to specify how many
pre-split regions you'd like it to have (including setting it to 0).



On Sun, Dec 21, 2014 at 5:38 PM, chenwenhui  wrote:
> Hi James,
> Thanks for your reply.
> SYSTEM.SEQUENCE has 256 regions by default, which seems like a large
> number.
> I once tried dropping the table and found that sequences stopped
> working. My application will never use sequences; are there any other
> side effects of dropping the SYSTEM.SEQUENCE table?
> If there are other side effects, how can I reduce the number of regions?
> Thanks again.
>
> At 2014-12-20 15:13:33, "James Taylor"  wrote:
>>Hi,
>>The system tables store and manage your metadata (i.e. tables, their
>>columns, views, sequences, indexes, etc.). You should leave them
>>alone. Phoenix manages (reads/writes) to these tables when necessary.
>>Thanks,
>>James
>>
>>On Thu, Dec 18, 2014 at 6:30 PM, chenwenhui  wrote:
>>> Does almost nobody care about these system tables?
>>>
>>>
>
>
>


Re:Re: What is the purpose of these system tables(CATALOG, STATS, and SEQUENCE)?

2014-12-21 Thread chenwenhui
Hi James,
Thanks for your reply.
SYSTEM.SEQUENCE has 256 regions by default, which seems like a large
number.
I once tried dropping the table and found that sequences stopped
working. My application will never use sequences; are there any other
side effects of dropping the SYSTEM.SEQUENCE table?
If there are other side effects, how can I reduce the number of regions?
Thanks again.

At 2014-12-20 15:13:33, "James Taylor"  wrote:
>Hi,
>The system tables store and manage your metadata (i.e. tables, their
>columns, views, sequences, indexes, etc.). You should leave them
>alone. Phoenix manages (reads/writes) to these tables when necessary.
>Thanks,
>James
>
>On Thu, Dec 18, 2014 at 6:30 PM, chenwenhui  wrote:
>> Does almost nobody care about these system tables?
>>
>>