Re: HBaseCon 2017
Thanks for asking, Zach. June 12th, the day before the DataWorks Summit in San Jose. Google is graciously hosting. It is looking like San Francisco but may be on the Mountain View campus. CfP should go out this weekend. In general, more details to follow.

St.Ack

On Fri, Feb 17, 2017 at 5:51 PM, Zach York wrote:
> Hello,
>
> Does anyone know if there will be an HBaseCon conference this year (and the
> relative timeline)?
> I'm trying to plan out different conferences that I want to attend this
> year and this information would help.
>
> Sorry if this is not the correct place to ask, just thought I'd try here!
>
> Thanks,
> Zach
HBaseCon 2017
Hello,

Does anyone know if there will be an HBaseCon conference this year (and the relative timeline)? I'm trying to plan out different conferences that I want to attend this year, and this information would help.

Sorry if this is not the correct place to ask, just thought I'd try here!

Thanks,
Zach
[jira] [Resolved] (HBASE-17577) Optimize file copying during backup restore operation
[ https://issues.apache.org/jira/browse/HBASE-17577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vladimir Rodionov resolved HBASE-17577.
---------------------------------------
    Resolution: Duplicate

Duplicate of HBASE-17150

> Optimize file copying during backup restore operation
> -----------------------------------------------------
>
>                 Key: HBASE-17577
>                 URL: https://issues.apache.org/jira/browse/HBASE-17577
>             Project: HBase
>          Issue Type: Task
>            Reporter: Vladimir Rodionov
>            Assignee: Vladimir Rodionov
>             Fix For: HBASE-7912
>
>
> Currently, we copy files into a TMP directory if the backup destination is on the
> same cluster as the source. This is because DistCp, by default, deletes source
> files when copying within the same cluster. This should be avoided.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Resolved] (HBASE-17150) Verify restore logic (remote/local cluster)
[ https://issues.apache.org/jira/browse/HBASE-17150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vladimir Rodionov resolved HBASE-17150.
---------------------------------------
    Resolution: Invalid

After the HBASE-17660 patch, this code is obsolete.

> Verify restore logic (remote/local cluster)
> -------------------------------------------
>
>                 Key: HBASE-17150
>                 URL: https://issues.apache.org/jira/browse/HBASE-17150
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Vladimir Rodionov
>            Assignee: Vladimir Rodionov
>             Fix For: HBASE-7912
>
>
> This part of the application is legacy code from the first version.
> If the backup destination is the local cluster, then during restore we copy
> HFiles into a local temp dir first. For a remote cluster we do not do this. It
> seems it should be the other way around.
> {quote}
> What does this mean?
> 253 2016-11-17 14:13:39,782 DEBUG [main] util.RestoreServerUtil: File
> hdfs://ve0524.halxg.cloudera.com:8020/user/stack/backup/backup_1479419995738/default/x_1/archive/data/default/x_1
> on local cluster, back it up before restore
> Is this a full copy of the backup to elsewhere?
> 296 2016-11-17 14:13:47,907 DEBUG [main] util.RestoreServerUtil: Copied to
> temporary path on local cluster: /user/stack/hbase-staging/restore
> {quote}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Created] (HBASE-17660) HFileSplitter is not being applied during full table restore
Vladimir Rodionov created HBASE-17660:
-----------------------------------------

             Summary: HFileSplitter is not being applied during full table restore
                 Key: HBASE-17660
                 URL: https://issues.apache.org/jira/browse/HBASE-17660
             Project: HBase
          Issue Type: Bug
            Reporter: Vladimir Rodionov
            Assignee: Vladimir Rodionov
             Fix For: HBASE-7912

The HFileSplitter M/R job splits snapshot files at given region boundaries before moving them with the bulk load tool. The current code for restoring a full table backup does not use this job. This should be fixed.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
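The region-boundary splitting the issue above describes can be illustrated with a small, self-contained sketch. This is not the actual HFileSplitter job (which is an HBase-internal MapReduce job over HFiles); the class and method names below are invented for illustration. The core idea is routing each sorted row key to the region whose [startKey, nextStartKey) range contains it:

```java
import java.util.*;

// Illustrative sketch only: route sorted row keys to regions by start-key
// boundary. Region start keys are sorted; the first region starts at "".
class RegionBoundarySplitter {

    static int regionIndexFor(String rowKey, List<String> startKeys) {
        int idx = Collections.binarySearch(startKeys, rowKey);
        // binarySearch returns (-(insertion point) - 1) when the key is absent,
        // so the containing region is the one whose start key precedes rowKey.
        return idx >= 0 ? idx : -idx - 2;
    }

    // Partition a sorted run of row keys into per-region buckets, the way a
    // splitter would carve one snapshot file into per-region output files.
    static Map<Integer, List<String>> partition(List<String> sortedRows,
                                                List<String> startKeys) {
        Map<Integer, List<String>> byRegion = new TreeMap<>();
        for (String row : sortedRows) {
            byRegion.computeIfAbsent(regionIndexFor(row, startKeys),
                                     k -> new ArrayList<>()).add(row);
        }
        return byRegion;
    }
}
```

A real splitter would apply this per key-value while rewriting HFiles, so that each output file falls entirely within one region and can be bulk-loaded without any further server-side splitting.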
Re: Online Meeting on embryonic FS Redo Project (HBASE-14439)
Thanks for the updates! I will review when I have time.

On 2/17/17, 4:16 PM, "Umesh Agashe" wrote:

Hi,

Here is the doc that summarizes our discussion about why we think a top-down approach requiring radical code changes, as compared to an incremental, phased (bottom-up) approach, will help us redo the FS directory layout.

https://docs.google.com/document/d/128Q0BqJY7OvHMUpEpZWKCaBrH1qDjpxxOVkX2KM46No/edit#heading=h.iyja9q78fh2j

Thanks,
Umesh

On Fri, Feb 17, 2017 at 12:57 PM, Stack wrote:
> Notes from this morning's online meeting @10AM PST (please fill in any
> detail I missed):
>
> IN ATTENDANCE:
> Aman Poonia
> Umesh Agashe, Cloudera
> Stephen Tak, AMZ
> Zach York, AMZ
> Francis Liu, Yahoo!
> Ben Mau, Yahoo!
> Sean Busbey, Cloudera
> Ted Yu, HWX
> Appy (Apekshit Sharma), Cloudera
>
> BACKGROUND (St.Ack)
> Y! want to do millions of regions in a cluster.
> Our current FS layout is heavily dependent on HDFS semantics (e.g. we depend
> heavily on HDFS rename doing atomic file and directory swaps); this
> complicates being able to run on another FS.
> HBase is bound to a particular physical layout in the FS.
> Matteo Bertozzi's experience with HDFS/HBase on S3, and a general irritation
> with how FS ops are distributed all about the codebase, had him propose a
> logical tier with a radically simplified set of requirements of the underlying
> FS (block store?); atomic operations would be done by HBase rather than
> farmed out to the FS.
> Matteo is not w/ us anymore, but he passed on the vision to Umesh.
>
> CURRENT STATE OF FS REDO PROJECT (Umesh)
> Currently it is shelved, but we hope to get back to it 'soon'.
> Spent a few months on FS REDO at the end of last year.
> The initial approach was to abstract out three Interfaces (originally
> sketched by Matteo in [1]).
> The idea was to centralize all FS use in a few well-known locations,
> then refactor all FS usage.
> Keep all metadata about tables, files, etc., in hbase:meta.
> The idea was to slowly migrate ops, tools, etc., over to the new Interface.
> This was a bottom-up approach: finding FS references and moving them
> to one place.
> We soon found too many refs all over the code.
> Found that we might not get to the desired simple Interface because the API
> had to carry around baggage.
> Matteo had tried this approach in [1] and started to argue this stepped
> migration would never arrive.
>
> So we restarted w/ the ideal Simple FS Interface, and the implementation
> seemed to flow smoothly.
> An in-memory POC that did simple file ops was posted a while back here [2].
>
> Given the two approaches taken above, experience indicates that the
> radical, top-down approach is more likely to succeed.
>
> WHY ARE PEOPLE INTERESTED IN FS REDO?
> Francis and Ben Mau: we want to be able to do 1M regions.
> St.Ack suggested that even small installs need to be able to do more,
> smaller regions.
> Zach is interested because he wants to optimize HBase over S3 (rename,
> consistency issues). He liked the idea of metadata up in the hbase:meta table
> and avoiding renames, etc.
>
> WHAT SHOULD WE DO?
> We have few resources. It is a big job (we've been talking about it a good
> while now). All docs are stale, missing the benefit of Umesh's recent
> explorations.
> Sean pointed out that before shelving, the idea was to try the PoC
> Interface against a new hbase operation other than simple file reading and
> writing (compactions?). If the PoC Interface survived in the new context,
> we'd then step back and write up a design.
> Seemed like as good a plan as any. The plan should talk about all the ways in
> which ops can go wrong.
> Thereafter, split up the work and bring over subsystems.
> It is looking like an hbase3 rather than an hbase2 project (though all hoped
> it could make hbase2).
>
> TODOs
> We agreed to post these notes with pointers to the current state of FS REDO
> (see below).
> Umesh and Stack to do up a one-pager on the current PoC, to be posted on this
> thread or up in the FS REDO issue (HBASE-14090).
> Keep up macro status on this thread.
>
> What else?
> Thanks,
> S
>
> 1. Matteo's original FS REDO suggested plan: https://docs.google.com/document/d/1fMDanYiDAWpfKLcKUBb1Ff0BwB7zeGqzbyTFMvSOacQ/edit#
> 2. Umesh's PoC: https://reviews.apache.org/r/55200/
> 3. HBASE-14090 is the parent issue for this project?
> 4. An old doc. to evangelize the idea of an FS REDO (mostly upstreaming
> Matteo's ideas): https://docs.google.com/document/d/10tSCSSWPwdFqOLLYtY2aVFe6iCIrsBk4Vqm8LSGUfhQ/edit#
>
>
> On Fri, Feb 17, 2017 at 9:53 AM, Stack
Re: Successful: HBase Generate Website
I pushed the website with the below patch.

FYI,
S

On Fri, Feb 17, 2017 at 7:02 AM, Apache Jenkins Server <jenk...@builds.apache.org> wrote:
> Build status: Successful
>
> If successful, the website and docs have been generated. To update the
> live site, follow the instructions below. If failed, skip to the bottom of
> this email.
>
> Use the following commands to download the patch and apply it to a clean
> branch based on origin/asf-site. If you prefer to keep the hbase-site repo
> around permanently, you can skip the clone step.
>
> git clone https://git-wip-us.apache.org/repos/asf/hbase-site.git
> cd hbase-site
> wget -O- https://builds.apache.org/job/hbase_generate_website/491/artifact/website.patch.zip | funzip > 7763dd6688254d37ad611f5d290db47c83cf93d3.patch
> git fetch
> git checkout -b asf-site-7763dd6688254d37ad611f5d290db47c83cf93d3 origin/asf-site
> git am --whitespace=fix 7763dd6688254d37ad611f5d290db47c83cf93d3.patch
>
> At this point, you can preview the changes by opening index.html or any of
> the other HTML pages in your local
> asf-site-7763dd6688254d37ad611f5d290db47c83cf93d3 branch.
>
> There are lots of spurious changes, such as timestamps and CSS styles in
> tables, so a generic git diff is not very useful. To see a list of files
> that have been added, deleted, renamed, changed type, or are otherwise
> interesting, use the following command:
>
> git diff --name-status --diff-filter=ADCRTXUB origin/asf-site
>
> To see only files that had 100 or more lines changed:
>
> git diff --stat origin/asf-site | grep -E '[1-9][0-9]{2,}'
>
> When you are satisfied, publish your changes to origin/asf-site using
> these commands:
>
> git commit --allow-empty -m "Empty commit" # to work around a current
> ASF INFRA bug
> git push origin asf-site-7763dd6688254d37ad611f5d290db47c83cf93d3:asf-site
> git checkout asf-site
> git branch -D asf-site-7763dd6688254d37ad611f5d290db47c83cf93d3
>
> Changes take a couple of minutes to be propagated. You can verify whether
> they have been propagated by looking at the Last Published date at the
> bottom of http://hbase.apache.org/. It should match the date in the
> index.html on the asf-site branch in Git.
>
> As a courtesy, reply-all to this email to let other committers know you
> pushed the site.
>
> If failed, see https://builds.apache.org/job/hbase_generate_website/491/console
Re: Online Meeting on embryonic FS Redo Project (HBASE-14439)
Notes from this morning's online meeting @10AM PST (please fill in any detail I missed):

IN ATTENDANCE:
Aman Poonia
Umesh Agashe, Cloudera
Stephen Tak, AMZ
Zach York, AMZ
Francis Liu, Yahoo!
Ben Mau, Yahoo!
Sean Busbey, Cloudera
Ted Yu, HWX
Appy (Apekshit Sharma), Cloudera

BACKGROUND (St.Ack)
Y! want to do millions of regions in a cluster.
Our current FS layout is heavily dependent on HDFS semantics (e.g. we depend heavily on HDFS rename doing atomic file and directory swaps); this complicates being able to run on another FS.
HBase is bound to a particular physical layout in the FS.
Matteo Bertozzi's experience with HDFS/HBase on S3, and a general irritation with how FS ops are distributed all about the codebase, had him propose a logical tier with a radically simplified set of requirements of the underlying FS (block store?); atomic operations would be done by HBase rather than farmed out to the FS.
Matteo is not w/ us anymore, but he passed on the vision to Umesh.

CURRENT STATE OF FS REDO PROJECT (Umesh)
Currently it is shelved, but we hope to get back to it 'soon'.
Spent a few months on FS REDO at the end of last year.
The initial approach was to abstract out three Interfaces (originally sketched by Matteo in [1]).
The idea was to centralize all FS use in a few well-known locations, then refactor all FS usage.
Keep all metadata about tables, files, etc., in hbase:meta.
The idea was to slowly migrate ops, tools, etc., over to the new Interface.
This was a bottom-up approach: finding FS references and moving them to one place.
We soon found too many refs all over the code.
Found that we might not get to the desired simple Interface because the API had to carry around baggage.
Matteo had tried this approach in [1] and started to argue this stepped migration would never arrive.
So we restarted w/ the ideal Simple FS Interface, and the implementation seemed to flow smoothly.
An in-memory POC that did simple file ops was posted a while back here [2].
Given the two approaches taken above, experience indicates that the radical, top-down approach is more likely to succeed.

WHY ARE PEOPLE INTERESTED IN FS REDO?
Francis and Ben Mau: we want to be able to do 1M regions.
St.Ack suggested that even small installs need to be able to do more, smaller regions.
Zach is interested because he wants to optimize HBase over S3 (rename, consistency issues). He liked the idea of metadata up in the hbase:meta table and avoiding renames, etc.

WHAT SHOULD WE DO?
We have few resources. It is a big job (we've been talking about it a good while now). All docs are stale, missing the benefit of Umesh's recent explorations.
Sean pointed out that before shelving, the idea was to try the PoC Interface against a new hbase operation other than simple file reading and writing (compactions?). If the PoC Interface survived in the new context, we'd then step back and write up a design.
Seemed like as good a plan as any. The plan should talk about all the ways in which ops can go wrong.
Thereafter, split up the work and bring over subsystems.
It is looking like an hbase3 rather than an hbase2 project (though all hoped it could make hbase2).

TODOs
We agreed to post these notes with pointers to the current state of FS REDO (see below).
Umesh and Stack to do up a one-pager on the current PoC, to be posted on this thread or up in the FS REDO issue (HBASE-14090).
Keep up macro status on this thread.

What else?
Thanks,
S

1. Matteo's original FS REDO suggested plan: https://docs.google.com/document/d/1fMDanYiDAWpfKLcKUBb1Ff0BwB7zeGqzbyTFMvSOacQ/edit#
2. Umesh's PoC: https://reviews.apache.org/r/55200/
3. HBASE-14090 is the parent issue for this project?
4. An old doc. to evangelize the idea of an FS REDO (mostly upstreaming Matteo's ideas): https://docs.google.com/document/d/10tSCSSWPwdFqOLLYtY2aVFe6iCIrsBk4Vqm8LSGUfhQ/edit#

On Fri, Feb 17, 2017 at 9:53 AM, Stack wrote:
> I put up a hangout.
If above link doesn't work, try this > https://hangouts.google.com/call/aaahkufdurgctflufw4ivhsngue and write > here if can't get in. > > St.Ack > > On Tue, Feb 14, 2017 at 12:36 PM, Stack wrote: > >> A few folks want to have a quick chat about the state of the proposed FS >> redo project. The proposal is for 10AM, this Friday morning, PST. All >> interested parties are invited to join (shout if 10AM PST is untenable and >> suggest an alternative). Below is a google hangout link that comes alive >> friday morning [1]. >> >> One of us will keep notes and post synopsis of discussion back here and >> in issue after the meeting is done. >> >> Suggest those who join try to do some background reading -- see >> HBASE-14439 -- so we are all around the same level of understanding when >> the meeting starts. Agenda will be a basic intros, current state of the >> project (with update on most recent effort), and then expectations. Basic. >> >> Thanks, >> S >> >> 1. https://plus.google.com/hangouts/_/calendar/c2FpbnQuYWNrQ >> GdtYWlsLmNvbQ.1oaqlr00ru20s1hqrsq1q05j3k?authuser=0 >> > >
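The meeting notes in this thread describe a layer in which atomic operations are done by HBase itself rather than farmed out to the FS. As a purely illustrative sketch (the class and method names below are invented here, not taken from the actual PoC on Review Board), the core idea can be shown as a store-file manager that makes a compaction-style file swap a single atomic step within the layer, with no reliance on an atomic rename in the backing store:

```java
import java.util.*;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the "simple FS interface" idea: callers see only
// logical store-file operations, and atomicity is provided by this layer
// (here, an in-memory commit) instead of by a rename in the backing store.
class InMemoryStoreFileManager {
    private final Map<String, Set<String>> committed = new ConcurrentHashMap<>();

    // Replace a region's store files in one step, the way a compaction
    // swaps its input files for its output file.
    public synchronized void swapStoreFiles(String region,
                                            Set<String> remove, Set<String> add) {
        Set<String> files = new HashSet<>(
            committed.getOrDefault(region, Collections.emptySet()));
        if (!files.containsAll(remove)) {
            throw new IllegalStateException("missing input files for swap");
        }
        files.removeAll(remove);
        files.addAll(add);
        committed.put(region, files); // single-step, atomic swap in this layer
    }

    public Set<String> storeFiles(String region) {
        return committed.getOrDefault(region, Collections.emptySet());
    }
}
```

In a real implementation the committed file lists would live in hbase:meta (as the notes suggest) rather than in memory, but the caller-facing contract — a single-step swap that either fully happens or doesn't — is the point of the abstraction.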
Re: looking for reviews on small security patches
done.

--
Cloudera, Inc.

On Fri, Feb 17, 2017 at 5:50 AM, Sean Busbey wrote:
> Hi folks!
>
> I'm hoping to get reviews on these two issues:
>
> Unvalidated Redirect in HMaster
> https://issues.apache.org/jira/browse/HBASE-15328
>
> table status page should escape values that may contain arbitrary
> characters.
> https://issues.apache.org/jira/browse/HBASE-17561
>
> I'd like to use some time over the long weekend to get a new 1.2.5
> release candidate posted, but I'd like to see these two issues closed
> out first.
Re: Online Meeting on embryonic FS Redo Project (HBASE-14439)
I put up a hangout. If the above link doesn't work, try this: https://hangouts.google.com/call/aaahkufdurgctflufw4ivhsngue and write here if you can't get in.

St.Ack

On Tue, Feb 14, 2017 at 12:36 PM, Stack wrote:
> A few folks want to have a quick chat about the state of the proposed FS
> redo project. The proposal is for 10AM, this Friday morning, PST. All
> interested parties are invited to join (shout if 10AM PST is untenable and
> suggest an alternative). Below is a google hangout link that comes alive
> friday morning [1].
>
> One of us will keep notes and post a synopsis of the discussion back here
> and in the issue after the meeting is done.
>
> Suggest those who join try to do some background reading -- see
> HBASE-14439 -- so we are all around the same level of understanding when
> the meeting starts. Agenda will be basic intros, current state of the
> project (with an update on the most recent effort), and then expectations.
> Basic.
>
> Thanks,
> S
>
> 1. https://plus.google.com/hangouts/_/calendar/c2FpbnQuYWNrQGdtYWlsLmNvbQ.1oaqlr00ru20s1hqrsq1q05j3k?authuser=0
[jira] [Resolved] (HBASE-17659) How to connect to hbase hdfs filesystem
[ https://issues.apache.org/jira/browse/HBASE-17659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dima Spivak resolved HBASE-17659.
---------------------------------
    Resolution: Not A Problem

Please use the user mailing list for questions about getting HBase up and running.

> How to connect to hbase hdfs filesystem
> ---------------------------------------
>
>                 Key: HBASE-17659
>                 URL: https://issues.apache.org/jira/browse/HBASE-17659
>             Project: HBase
>          Issue Type: Task
>          Components: API
>    Affects Versions: 0.94.7
>            Reporter: Jenson Luke
>            Priority: Blocker
>
> I am not able to connect to the HBase HDFS file system. When I run my Java
> program on the server, it picks up the local file system instead of the
> HDFS file system.
> When running on the server, I am passing only:
> conf = HBaseConfiguration.create();
> fs = FileSystem.get(this.conf);
> Path tabledir = new Path(fs.makeQualified(new Path(conf.get(HConstants.HBASE_DIR))), tableName);
> It gives the value of tabledir as
> "/tmp/hbase-hbase/hbase/tsdb-uid_jentab_bkp1_scen06"
> My actual hdfs path is "hdfs://ibdash-.xx.xx..net:8020/hbase".

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
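For anyone hitting the symptom in the report above (a table dir resolving under /tmp/hbase-hbase/), the usual cause is that HBaseConfiguration.create() could not find hbase-site.xml/core-site.xml on the classpath, so hbase.rootdir and fs.defaultFS fall back to local-filesystem defaults. A minimal sketch, assuming the Hadoop and HBase client jars are on the classpath and with a hypothetical namenode host (this is not runnable without a real cluster), is to resolve the FileSystem from the root-dir Path itself rather than from the default FS:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HConstants;

public class HBaseRootDirExample {
    public static Path qualifiedTableDir(Configuration conf, String tableName)
            throws java.io.IOException {
        Path rootDir = new Path(conf.get(HConstants.HBASE_DIR));
        // Derive the FileSystem from the Path, so an hdfs:// rootdir yields
        // HDFS regardless of what fs.defaultFS happens to say.
        FileSystem fs = rootDir.getFileSystem(conf);
        return new Path(fs.makeQualified(rootDir), tableName);
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // If the config files are not on the classpath, set the values
        // explicitly (hostname below is a placeholder, not from the report):
        conf.set("fs.defaultFS", "hdfs://namenode.example.com:8020");
        conf.set(HConstants.HBASE_DIR, "hdfs://namenode.example.com:8020/hbase");
        System.out.println(qualifiedTableDir(conf, "mytable"));
    }
}
```

Ensuring the cluster's conf directory is on the classpath is the cleaner fix; the explicit conf.set calls are shown only to make the failure mode visible.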
Successful: HBase Generate Website
Build status: Successful

If successful, the website and docs have been generated. To update the live site, follow the instructions below. If failed, skip to the bottom of this email.

Use the following commands to download the patch and apply it to a clean branch based on origin/asf-site. If you prefer to keep the hbase-site repo around permanently, you can skip the clone step.

git clone https://git-wip-us.apache.org/repos/asf/hbase-site.git
cd hbase-site
wget -O- https://builds.apache.org/job/hbase_generate_website/491/artifact/website.patch.zip | funzip > 7763dd6688254d37ad611f5d290db47c83cf93d3.patch
git fetch
git checkout -b asf-site-7763dd6688254d37ad611f5d290db47c83cf93d3 origin/asf-site
git am --whitespace=fix 7763dd6688254d37ad611f5d290db47c83cf93d3.patch

At this point, you can preview the changes by opening index.html or any of the other HTML pages in your local asf-site-7763dd6688254d37ad611f5d290db47c83cf93d3 branch.

There are lots of spurious changes, such as timestamps and CSS styles in tables, so a generic git diff is not very useful. To see a list of files that have been added, deleted, renamed, changed type, or are otherwise interesting, use the following command:

git diff --name-status --diff-filter=ADCRTXUB origin/asf-site

To see only files that had 100 or more lines changed:

git diff --stat origin/asf-site | grep -E '[1-9][0-9]{2,}'

When you are satisfied, publish your changes to origin/asf-site using these commands:

git commit --allow-empty -m "Empty commit" # to work around a current ASF INFRA bug
git push origin asf-site-7763dd6688254d37ad611f5d290db47c83cf93d3:asf-site
git checkout asf-site
git branch -D asf-site-7763dd6688254d37ad611f5d290db47c83cf93d3

Changes take a couple of minutes to be propagated. You can verify whether they have been propagated by looking at the Last Published date at the bottom of http://hbase.apache.org/. It should match the date in the index.html on the asf-site branch in Git.
As a courtesy, reply-all to this email to let other committers know you pushed the site.

If failed, see https://builds.apache.org/job/hbase_generate_website/491/console
[jira] [Created] (HBASE-17659) How to connect to hbase hdfs filesystem
Jenson Luke created HBASE-17659:
-----------------------------------

             Summary: How to connect to hbase hdfs filesystem
                 Key: HBASE-17659
                 URL: https://issues.apache.org/jira/browse/HBASE-17659
             Project: HBase
          Issue Type: Task
          Components: API
    Affects Versions: 0.94.7
            Reporter: Jenson Luke
            Priority: Blocker

I am not able to connect to the HBase HDFS file system. When I run my Java program on the server, it picks up the local file system instead of the HDFS file system.

When running on the server, I am passing only:

conf = HBaseConfiguration.create();
fs = FileSystem.get(this.conf);
Path tabledir = new Path(fs.makeQualified(new Path(conf.get(HConstants.HBASE_DIR))), tableName);

It gives the value of tabledir as "/tmp/hbase-hbase/hbase/tsdb-uid_jentab_bkp1_scen06". My actual hdfs path is "hdfs://ibdash-.xx.xx..net:8020/hbase".

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
looking for reviews on small security patches
Hi folks!

I'm hoping to get reviews on these two issues:

Unvalidated Redirect in HMaster
https://issues.apache.org/jira/browse/HBASE-15328

table status page should escape values that may contain arbitrary characters.
https://issues.apache.org/jira/browse/HBASE-17561

I'd like to use some time over the long weekend to get a new 1.2.5 release candidate posted, but I'd like to see these two issues closed out first.